Chapter 1
Introduction



The problem of the study 

This is an investigation of the possibility of introducing certain
relatively novel types of test material to supplement or replace the kinds
of test items found in the written College Board achievement examinations
in foreign languages. It should be said at the outset that for reasons of
research strategy the main emphasis of this study has been upon the improvement
of tests concerned with the written aspect of a foreign language.

The importance of improving methods of testing competence with the spoken
language is well recognized, but this study has devoted only minor attention
to this problem because of the traditional concern of the College Entrance
Examination Board with written tests.

The interested reader can find examples of typical CEEB foreign
language test items in a pamphlet published from time to time by the Board,
Foreign Languages: A Description of the College Board Tests in French, German,
Latin, and Spanish. Objective test items of the sort found in this pamphlet
characteristically yield highly satisfactory item-test correlations and are
thus highly reliable, as has been pointed out by Paula Thibault (31).

It seems to be the general opinion, also, that they are valid measurements
of knowledge of a foreign language, at least of that kind of relatively
intellectualized knowledge which is represented by the ability to state the
meanings of foreign language words, the ability to choose correct grammatical
forms, and the ability to answer questions based on printed prose paragraphs
in the foreign language.

Use of these currently fashionable types of test items nevertheless
presents several difficulties, among which are the following:

(1) As in many other fields, the construction of good items is an
expensive, time-consuming process requiring the services of
subject-matter specialists carefully trained in techniques of
test construction.

(2) It has been argued that items tend to draw upon highly specific
knowledges rather than upon a broad competence with the language
as a whole. (This argument possibly overlooks the fact that the
items nevertheless constitute a valid sample of language habits.)

(3) There is no satisfactory rationale for scaling the resulting test
scores in terms of the extent to which native competence in the
language is achieved.

(4) It is possible that there is a ceiling effect whereby conventional
item types tend not to measure accurately at the upper levels
of ability, because they measure the mere existence of language
habits rather than their strength.

This study sought to investigate certain novel types of items which
might circumvent some of the above difficulties and at the same time be
characterized by satisfactory reliability and validity.

These "novel" types of items utilize what Wilson L. Taylor (26) has
called the "cloze" procedure (after the "closure" which is supposed to
occur when the subject supplies a word, letter, or phrase to fill a lacuna
in a continuous text). They call upon the examinee's acquired stock of
habits relative to what may be called the statistical contingencies of a
language. Continuous texts in a language can be shown to exhibit many
kinds of statistical regularity. For example, the probability that a noun
will follow an adjective in certain contexts is very high. The probability
that a vowel-letter will follow the letter sequence STR is well-nigh perfect.
Whether and how much these regularities function in the composing of
sentences as they are uttered or written is open to considerable question;
nevertheless, the hearer of an utterance or the reader of a text, to the
extent that he is well acquainted with the language, will develop expectancies
as to what is going to follow any given part of the text. The more
experience he has with the language, the better he will be able to predict
how to complete a given context, or to fill in a lacuna. It seemed reasonable
to suppose, therefore, that a valid method of testing competence with
a language would be to ask the examinee to try to restore gaps or other
types of mutilations in a text.
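
The notion of a statistical contingency can be made concrete with a small sketch. The Python fragment below is only an illustration, not a procedure used in this study: it tallies which letters follow a given letter sequence in a sample text and reports the relative frequency with which the next letter is a vowel. The function names and the sample sentence are hypothetical, and a serious estimate would of course require a large body of running text.

```python
from collections import Counter

VOWELS = set("AEIOU")


def next_letter_counts(text: str, context: str) -> Counter:
    """Count the letters that immediately follow each occurrence of `context`."""
    text = text.upper()
    context = context.upper()
    counts = Counter()
    start = text.find(context)
    while start != -1:
        nxt = start + len(context)
        if nxt < len(text) and text[nxt].isalpha():
            counts[text[nxt]] += 1
        start = text.find(context, start + 1)
    return counts


def vowel_probability(text: str, context: str):
    """Relative frequency of a vowel-letter after `context`; None if never seen."""
    counts = next_letter_counts(text, context)
    total = sum(counts.values())
    if total == 0:
        return None
    vowels = sum(n for letter, n in counts.items() if letter in VOWELS)
    return vowels / total


# Hypothetical sample text, chosen only for illustration.
sample = "The strong stream struck the strange street; strict strides."
print(next_letter_counts(sample, "str"))
print(vowel_probability(sample, "str"))
```

In this toy sample every occurrence of STR happens to be followed by a vowel-letter; with real text the estimate would fall slightly short of certainty.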

It was recognized that this method of testing language competence might
have its own share of difficulties. Above all, a characteristic of the
method is that it frequently requires "guessing," particularly in contexts
where a number of different restorations of a text might be regarded as
equally plausible or satisfactory. Under such circumstances it might be
expected that even a person with a highly competent knowledge of a language
might not be able to divine the "correct answer" (this being whatever stood
in the original text). If guessing is frequent, reliability will inevitably
decline.

Another difficulty is the very likely possibility that differences in
"intelligence" or other extraneous dimensions of human variation might mask
the variation in foreign language competence which this type of test seeks
to measure. If such were found to be the case, it would be necessary to
consider the advisability of trying to adjust for the influence of this
disturbing variable.



Still another difficulty which was seen to be characteristic of this
method of measuring language competence is the fact that the individual's
ability to restore a text depends partly upon the nature and difficulty of
the text itself. Indeed, Taylor developed the "cloze procedure" initially
as a measure of the readability of texts;
it was only at a later stage that he became interested in the possibility
of using the technique for measuring individual differences in ability.

Despite these known technical difficulties, it was hoped that the
cloze procedure in one or more of its possible variants might have certain
advantages at least as a supplementary technique for the measurement of
language competence. If the cloze procedure were found to be successful
and if the technical difficulties could be obviated, it might have the
following advantages:



(1) The technique offers a method of constructing test materials
relatively cheaply and quickly. It would not call for so much time and
attention on the part of subject-matter specialists as is the case with
conventional test materials. One could readily construct materials for
testing knowledge of any language provided appropriate texts were available.

(2) "Cloze" tests might measure aspects of language competence
somewhat different from those measured by conventional tests, and it is
conceivable that these competences might be regarded as more central to
true mastery of the language.

(3) By using certain concepts available from Shannon's (24)
information theory, it might be possible to justify a rational scale of
language competence extending from complete absence of competence up to
native or native-like competence.

(4) The tests might provide better differentiation at the upper levels
of ability, discriminating between those with true mastery of a language
and those with a more superficial knowledge of language patterns.

History of the problem

The "cloze" procedure was first invented by the German psychologist
Ebbinghaus around 1897. Most graduate students in educational psychology
are told that Ebbinghaus invented the ordinary "completion" or "fill-in"
item which is so familiar in tests of intelligence and achievement.
Actually, Ebbinghaus did no such thing; his kind of "completion" item was
much closer to Taylor's cloze procedure. His test consisted of large
quantities of Gulliver's Travels (in German translation) from which syllables
had been more or less systematically deleted; the children were required
to try to restore the original words. The test was intended to measure,
not intelligence, but the degree of fatigue in mental functions experienced
by school children at different times during the school day. Ebbinghaus
found the test to be a poor measure of fatigue, but he noted that the
test had high relationships with both age and excellence in school
performance (11).

The cloze technique also bears some resemblance to the method of
controlled association developed first by Thumb and Marbe (see Woodworth
[33, pp. 340-367] for an account of the history of the controlled association
method) in the sense that it requires the subject to give a verbal response
under certain contextual constraints. Howes and Osgood (15) have
demonstrated certain properties of these contextual constraints in a
controlled association experiment.









In 1939, Carroll (4) constructed a test of individual differences
which also has some resemblance to the cloze item. This test, called
"Phrase Completion," presented to the subject short incomplete phrases like

As for ____

and asked the subject to write the first response that came to mind. It
was scored by community of response, that is, the response given to each
item was weighted roughly in proportion to the frequency with which the
response was given by a normative sample. After refinement through item
analysis, the test was used in a factor analysis study conducted with a
college-student sample; it was found to be one of the purest measures of
"V" (the verbal knowledge factor) available. It has since been used in
several further factor studies (5, 10) and has consistently maintained its
status as a pure measure of V. Its reliability, however, has not been
satisfactory enough to recommend it for operational use. This is possibly
because the test is too short and has not been subjected to any major effort
to improve it. The reason for mentioning it here is to point out that a
test requiring a subject to supply verbal material to fit a context, where
the "correctness" of the response depends on the likelihood of the response
in that context, has been found to be a good measure of the examinee's
knowledge of his native language.
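
Scoring by community of response can be sketched as follows. This is a minimal illustration that assumes the weight of a response is simply its relative frequency in the normative sample; the text says only that the weighting was "roughly in proportion" to frequency, so the exact rule, the function names, and the sample responses below are all hypothetical.

```python
from collections import Counter


def community_weights(normative_responses):
    """Weight each response by its relative frequency in the normative sample."""
    counts = Counter(r.strip().lower() for r in normative_responses)
    total = sum(counts.values())
    return {response: n / total for response, n in counts.items()}


def score_response(response, weights):
    """A response never given by the normative sample earns a weight of zero."""
    return weights.get(response.strip().lower(), 0.0)


# Hypothetical normative responses to the incomplete phrase "As for ____".
normative = ["me", "me", "me", "now", "example", "me", "now", "that"]
weights = community_weights(normative)
print(score_response("Me", weights))     # the commonest response
print(score_response("zebra", weights))  # an idiosyncratic response
```

A subject's total score would then be the sum of the weights earned over all items, so that common responses count for more than rare ones.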

With the advent of "information theory" as developed chiefly by
Shannon (24), other ways of asking subjects to respond to contextual
constraints have been introduced and tried out. Much of the experimental
work has been focussed on characteristics of texts and messages rather
than on the differences in individual ability to respond. Shannon himself
(23) invented a special kind of guessing game for estimating the "redundancy"
of printed English. In this game, one attempts to guess each successive
letter of a text either on the basis of the text already exposed or on
the basis of the preceding n letters. Shannon estimated that printed
English is about 75% redundant; that is, in theory one should be able to
recode a text with about one-quarter of the number of characters and still
convey the same information. Burton and Licklider (3) confirmed
Shannon's estimate, using similar techniques. Chapanis (7) showed that
random deletion of anything beyond 25% of the characters of a text made
it exceedingly difficult to be restored to its original state, at least
for most people. Miller and Friedman (20), however, have found that
superior subjects, given practically unlimited time, can restore texts
with up to approximately 50% deletions, and in view of certain other
considerations this figure corresponds to a lower bound of about 60%
redundancy. It seems to be well established, at any rate, that printed
English has a redundancy of somewhere above 50%, certainly enough to make
it feasible to construct tests requiring subjects to restore texts.
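
The relation between redundancy and recoding can be shown with a short sketch. It assumes the usual information-theoretic definition, redundancy = 1 - H / H_max, with H_max the logarithm (base 2) of the alphabet size; the 27-symbol alphabet (26 letters and the space) and the function names are assumptions made for illustration. The entropy function gives only a crude single-letter estimate, which ignores context and therefore understates the redundancy that a guessing game reveals.

```python
import math
from collections import Counter


def entropy_bits_per_symbol(text: str) -> float:
    """First-order (single-symbol) entropy estimate, in bits per symbol."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def redundancy(entropy_bits: float, alphabet_size: int) -> float:
    """Redundancy = 1 - H / H_max, where H_max = log2(alphabet_size)."""
    return 1.0 - entropy_bits / math.log2(alphabet_size)


def recoded_fraction(redundancy_value: float) -> float:
    """Fraction of the original length needed by an ideal recoding."""
    return 1.0 - redundancy_value


# If the true entropy were one quarter of the maximum for a 27-symbol
# alphabet, the text would be 75% redundant and could in theory be
# recoded in about one quarter of its length.
h = math.log2(27) / 4
print(redundancy(h, 27))
print(recoded_fraction(redundancy(h, 27)))
```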

All the studies mentioned in the preceding paragraph were concerned
with the guessing of letters deleted from a printed text. To the writer's
knowledge, there are no studies of the guessing of phonemes deleted from
a spoken text, although the studies of the intelligibility of speech with
various kinds of interference as reviewed by Licklider and Miller (16)
suggest clearly that speech has a high degree of redundancy. Miller and
Selfridge (21) performed a "Shannon guessing game" with printed words
in order to construct artificial texts with various degrees of statistical
approximation to English.

In developing the "cloze technique," Wilson Taylor (26)
has used tests which involve the guessing of deleted words. Taylor's chief
purpose was to arrive at a measure of the "readability" of prose. The procedure
calls for systematically deleting words in the sample of prose one desires
to measure and submitting it to a panel of 25 or so people who are asked to
guess the omitted words. The readability score of the passage is then
computed as a function of the number of guesses which correspond perfectly
to the omitted words in the original text. Readability scores determined
by the cloze technique have been found to correlate well with those determined
by other techniques, and in some cases they seem to correspond better
to common sense assessment of readability. The cloze technique has also
been shown to "work well" in Korean (29) and presumably it would work in
other languages, that is, as a measure of the relative readability of prose
passages. Cofer (8, 9) has used the cloze technique for slightly different
purposes; in one study he used it to explore the characteristics of passages
known to differ in adjective-verb quotient, and in another to index the
trial-to-trial changes in the memorization of a prose passage.

It is convenient to adopt Taylor's term "cloze procedure" to apply to
all types of test in which the subject is given a text with certain indicated
deletions and asked to try to restore the original text. (Deletions are
always indicated by replacing the deleted item, whether letter or word, with
a blank of standard size.)
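
The word-deletion form of the procedure can be sketched in a few lines. The fragment below is an illustration under stated assumptions, not Taylor's own scoring routine: it assumes that "systematic" deletion means removing every n-th word, it replaces each deleted word with a blank of one standard size, and it counts only guesses that match the original word exactly (ignoring case and adjoining punctuation). The function names, the deletion interval, and the sample sentence are hypothetical.

```python
BLANK = "_____"  # every deletion is shown by a blank of one standard size


def normalize(word: str) -> str:
    """Ignore case and adjoining punctuation when comparing words."""
    return word.strip(".,;:!?\"'()").lower()


def make_word_cloze(text: str, every_n: int = 5, start: int = 4):
    """Delete every `every_n`-th word, beginning with the word at index `start`."""
    words = text.split()
    answers = {}
    mutilated = []
    for i, word in enumerate(words):
        if i >= start and (i - start) % every_n == 0:
            answers[i] = word
            mutilated.append(BLANK)
        else:
            mutilated.append(word)
    return " ".join(mutilated), answers


def exact_match_score(answers: dict, guesses: dict) -> int:
    """Number of guesses that correspond exactly to the deleted words."""
    return sum(
        1
        for i, original in answers.items()
        if normalize(guesses.get(i, "")) == normalize(original)
    )


text = "the quick brown fox jumps over the lazy dog near the old stone bridge today"
mutilated, answers = make_word_cloze(text)
print(mutilated)
print(exact_match_score(answers, {4: "Jumps", 9: "by", 14: "today"}))
```

Summing such scores over a panel of readers gives a figure for the passage; keeping them separate for each reader gives a figure for the individual, which is the use of interest here.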

Individual differences in performance with the cloze procedure have
been treated largely as a nuisance variable in many of the studies cited
above. Shannon (23) dodged the problem of individual differences by
using mainly one subject, his wife. Miller and Friedman (20), like
many others, report the central tendency but not the variation of performance
scores. Actually they are more concerned with the "best" or "maximal"
degree of performance obtainable. As early as 1953, however, Taylor
perceived the potentiality of the cloze technique as a way of measuring
individual differences in language skill. His study, finally published in
1957 (30), showed that cloze scores attained by individual subjects were
rather highly correlated with intelligence as measured by the Armed Forces
Qualifying Test as well as with comprehension of, and success in learning,
prose paragraphs concerned with certain technical subjects. Further,
test-retest reliabilities ranging from .74 to .88 were obtained by tests with 80
items given in a 60-minute time limit. Thus far, Taylor's study is the
only published one of which the writer is aware that is explicitly concerned
with cloze technique as a method of measuring individual differences.

The idea of using cloze technique (in one or more of its possible
variations) as a method of measuring foreign language competence is a rather
obvious development from the other uses to which cloze technique has been put.
Taylor himself suggested (29) that it could be used in this way, but the first
public suggestion to this effect seems to have been made by Victor Yngve of
Massachusetts Institute of Technology in some informal remarks at the Northeast
Conference on the Teaching of Modern Foreign Languages held at Brown
University in the spring of 1954. Yngve's proposal suggested that the
guessing of omitted letters in a foreign language text would be an effective
technique. Bruner (2) seems to have been thinking along somewhat similar
lines in speaking of an experiment which he and Robert Harcourt conducted at
an international seminar at Salzburg. These experiments tested Italian,
German, Swedish, French, Dutch, and English speakers on their ability to
reproduce random strings of letters presented briefly, and third-order
approximations to each of these languages. "As you would expect," writes
Bruner, "there was no difference in ability to handle random strings, but
a real difference in ability, favoring one's mother tongue, in reproducing
nonsense in one's own language."

Overall design of the study 

The present study is timely in the sense that it is one of the first 
to make investigations of the technical characteristics of cloze-technique 
items as measuring devices for individual differences. Our knowledge 
of the measurement characteristics of the cloze item is scant indeed; 
this study was designed to fill some of the gaps in our knowledge of the 
cloze-technique as a measure of competence in English, despite the fact 
that the main focus of interest was in the construction of foreign 
language tests. 

The first step taken was to select texts from which test materials
could be drawn. Since there was interest in comparing the effectiveness
of the cloze procedure in various languages (English, French, and German),
it was thought necessary to assemble closely comparable materials in the
three languages. Comparability was sought by locating materials existing in
reasonably adequate translations in all three languages. From these
materials a series of texts were drawn and made up into cloze tests in all
three languages. Two types of cloze test were prepared: (1) the conventional
cloze test as used by Wilson Taylor involving deleted words, and (2) sequences
of letters of three different lengths from which letters were deleted
either initially, medially, or terminally. This latter type of test was
one of the types used by Miller and Friedman (20) in their studies of the
statistical redundancy of English. The development of the tests is
described in Chapter 2. For convenience, the tests will be designated
word-cloze and letter-cloze, respectively.
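
A letter-cloze item of the second type can be sketched as below. The fragment is a hypothetical illustration only: the sequence lengths and the number of letters deleted in the actual tests are not assumed here, and the function name, the one-underscore-per-letter blank, and the sample word are inventions for the example. It shows merely what initial, medial, and terminal deletion from a letter sequence amount to.

```python
def letter_cloze(sequence: str, n_deleted: int, position: str, blank: str = "_"):
    """Delete `n_deleted` consecutive letters at the start, middle, or end.

    Returns the mutilated sequence and the deleted letters (the answer key).
    """
    if not 0 < n_deleted <= len(sequence):
        raise ValueError("n_deleted must be between 1 and the sequence length")
    if position == "initial":
        start = 0
    elif position == "medial":
        start = (len(sequence) - n_deleted) // 2
    elif position == "terminal":
        start = len(sequence) - n_deleted
    else:
        raise ValueError("position must be 'initial', 'medial', or 'terminal'")
    end = start + n_deleted
    mutilated = sequence[:start] + blank * n_deleted + sequence[end:]
    return mutilated, sequence[start:end]


for where in ("initial", "medial", "terminal"):
    print(where, letter_cloze("language", 2, where))
```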

An assumption underlying the measurement of foreign language competence 
is that the performance of adult native speakers of the language in 
question provides a criterion or standard against which the performance 
of the learner can be assessed. If one is truly measuring competence 
with a language as a general medium of communication, native speakers of 
a language are in some sense equal or uniform in their ability; linguists 
frequently seem to assume, at least, that native speakers are uniform in 
their knowledge of the language structure as the linguist defines it. 

Chapter 3 reports experiments designed to study the presumed degree of
uniformity of language knowledge among native speakers. It also seeks to
find out whether there are any differences in the redundancy of English,
French, and German, using bilingual speakers as their own controls.

In the course of the experiments reported in Chapter 3, several
disturbing methodological issues presented themselves. One, for example,
had to do with the effect of context beyond the sentence. To what extent
does the normal "logical" sequence of sentences in a paragraph supply
contextual information over and above that contributed by the sentence
itself? Another important question had to do with the extent to which
cloze scores are affected by what may be called a "general ability to
perform well on cloze tests," quite apart from knowledge of the language
used in a particular cloze test. These and other problems are explored in
Chapter 4.

Chapter 5 reports the results of a series of tryouts of some of the
experimental tests in foreign language classes in several secondary schools,
both public and private. Investigations of item validity, test reliability,
and test validity are shown, as well as correlations with teachers' grades,
intelligence tests, and College Board tests.

Chapter 6 reports a small exploration into the possibility of using
cloze procedure with the spoken language. Since the procedure had scarcely
been given a trial in any language, the experiment reported here was done
on native speakers of English. One variable which was studied was the
effect of manner of presentation, with and without "natural" sentence
intonation.

As the reader may have noted in the brief review of the literature, 
no investigator has attempted to adapt cloze technique items to the format 
of machine-scorable objective test items. Nor did the present study 
attempt to see whether this was feasible for foreign language tests. 

Instead, it was conceived primarily as a brief pilot study to examine the
general feasibility of the cloze technique, in several of its well-studied
variants, as a method of measuring foreign language competence.

Chapter 2

Development of "Cloze Procedure" Test Materials in
English, French, and German

Types of "cloze" procedure developed 

If "cloze procedure" means any procedure in which one asks subjects
to restore a mutilated text, one could legitimately employ almost any set
of rules for mutilating a text. One would merely have to choose a unit
for deletion and a rule for determining which units to delete. For example,
in dealing with a printed text one could decide to delete selected parts
of individual letters, say the lower half of the printed line, as has
been done in certain experiments on reading (1, pp. 194-195), or one might
decide to delete the last inch of each printed line. It occurs to us to
remark, incidentally, that "cloze procedure" is really nothing new for
epigraphers and decipherers of ancient manuscripts, who have always had to
contend with the "deletions" caused by the ravages of time.

For the present study, it was decided to sample the possible range of
deletion systems at two levels: the word and the alphabetic character.
In so doing, we were in effect following Taylor (26), who had used word
deletion, and Miller and Friedman (20), who studied letter deletion.
It seems possible that these two levels of deletion might call upon different
kinds of language competence. Restoration of texts with deleted letters
would require knowledge of the spelling of individual words and of the
characteristic letter transitions of a written language system, while
restoration of word deletions might depend rather on the ability to grasp
the total meaning of written texts and on the availability of the
individual's vocabulary in the language. Testing at both levels would,
it was hoped, provide a well rounded picture of the individual's language
competence as measured by the cloze procedure.

Selection of textual materials 

The preparation of word-cloze tests (to be described in detail
below) entailed the selection of a series of texts. There were two major
considerations in the selection of texts: (1) texts were needed which
existed in appropriate versions (the original, or translations) in all
three languages for which it was desired to develop tests, and (2) the
texts should cover an appropriate range of difficulty.

There were a number of reasons for deciding to select texts which
existed in all three languages with which the project was concerned.
The primary reason was that our objective was to establish difficulty
equivalences across languages; that is, we desired to know whether, other
things being equal, French and German were, respectively, easier or more
difficult than English for native speakers doing cloze-procedure tests.
There was also the necessity of appraising an individual's second-language
competence relative to his competence in his native language, and such an
appraisal, it was thought, could be best accomplished by having materials
which could be established as being of equivalent difficulty across languages.

Covering an appropriate range of difficulty was important because we
foresaw the need of selecting materials which would be of the optimal level
of difficulty for appraising the foreign language competence of
secondary-school students. Materials which proved to be too difficult even for
discriminating among native speakers of a foreign language would surely
prove to be too difficult for high-school students of that language.
Difficulty, in this context, refers to the overall success or lack of
success which subjects have in restoring a mutilated text. Taylor's work
with "readability" has demonstrated wide variations in the difficulty of
passages, and it is his thesis that "readability" or "ease of comprehension"
can be measured by the cloze technique.

Because of the considerable difficulty experienced in trying to locate
text passages for which versions were available in English, French, and
German, no attempt was made to sample the possible range of passage
difficulty systematically. In any case, prior to the actual tryout of the
passages there were no reliable measures of passage difficulty. To be sure,
various readability formulas could have been applied to the passages in
English, but they could not have been confidently applied to the passages
in French or German. (Long after the passages had been selected, one of the
original Flesch readability formulas was applied to the English passages,
with results as indicated in Table 2.2. These results show that a wide range
of difficulty was covered, even though not completely and not systematically.)

After a considerable amount of searching, it proved possible to locate
texts which were available in English, French, and German versions. Besides
covering a reasonable range of difficulty levels, the searchers were able
to secure for each language at least a few texts which were originally
composed in that language. In order to avoid bias of results due to
subjects' prior knowledge of the selections, it was considered important to
select texts which would probably not be too widely familiar. The standard
classics were generally avoided, while contemporary literature and the
lesser known works of well-known authors were given prominence. Both
fictional and non-fictional materials were sampled. Table 2.1 presents a list

TABLE 2.1

SOURCES OF TWENTY PASSAGES FOUND IN ENGLISH, FRENCH, AND GERMAN VERSIONS

The code designation of each passage consists of three parts:
the symbol q (for "cloze"), a number from 1 to 20, and E, F, or G
(for the language). In several cases two or more passages were taken
from each source; these are listed with the corresponding page numbers
for each edition.

Passage
Number          Author                      Reference

1, 2            Immanuel Kant           E.: Kant's Prolegomena, edited in English by
                                            Paul Carus, Chicago, Open Court Publ. Co.,
                                            1912. (q1E, pp. 161-2; q2E, pp. 82-3)
                                        F.: Prolégomènes, Paris, Librairie Hachette et
                                            Cie, 1891. (q1F, pp. 261-2; q2F, pp. 136-7)
                                       *G.: Prolegomena, heraus. v. J. H. v. Kirchmann,
                                            Berlin, Philos.-histor. Verlag, 1893.
                                            (q1G, p. 150; q2G, pp. 76-7)

3, 4            Alan Paton             *E.: Cry, the Beloved Country, Chas. Scribner
                                            and Sons, New York, 1958.
                                            (q3E, p. 238; q4E, p. 130)
                                        F.: Pleure, ô Pays Bien-Aimé, Traduit de
                                            l'anglais par Denise van Moppès, Albin
                                            Michel, Paris, 1950.
                                            (q3F, pp. 374-75; q4F, pp. 208-9)
                                        G.: Denn Sie Sollen Getröstet Werden, Fischer
                                            Bücherei, Frankfurt/M., 19__.
                                            (q3G, pp. 215-6; q4G, p. 117)

5               Margaret B. Johnstone  *E.: "The most valuable thing a man can spend,"
                                            Reader's Digest, August, 1957. (Condensed
                                            from Guideposts) (q5E, p. 82)
                                        F.: "Votre bien le plus précieux," Sélection
                                            du Reader's Digest, Oct., 1957.
                                            (q5F, p. 40)
                                        G.: "Unser kostbarstes Gut," Das Beste aus
                                            Reader's Digest, Oct., 1957. (q5G, p. 97)

6               Max Eastman            *E.: "How human are animals?" Reader's Digest,
                                            Aug., 1957. (Condensed from Saturday
                                            Review) (q6E, p. 115)
                                        F.: "Nos frères, les animaux," Sélection du
                                            Reader's Digest, Oct., 1957. (q6F, p. 100)
                                        G.: "Wie menschlich sind doch Tiere!" Das Beste
                                            aus Reader's Digest, Oct., 1957.
                                            (q6G, pp. 59-60)

7               Wolfgang Langewiesche  *E.: "A new look at Niagara Falls," Reader's
                                            Digest, Aug., 1957. (q7E, pp. 159-160)
                                        F.: "Un regard neuf sur les Chutes du Niagara,"
                                            Sélection du Reader's Digest, Oct., 1957.
                                            (q7F, pp. __)
                                        G.: "Niagarafälle - geologisch gesehen," Das
                                            Beste aus Reader's Digest, Oct., 1957.
                                            (q7G, pp. 91-2)

8               Arthur C. Clark        *E.: "Secrets of the Sun," Reader's Digest,
                                            Aug., 1957. (Condensed from Holiday)
                                            (q8E, p. 206)
                                        F.: "Les Secrets du Soleil," Sélection du
                                            Reader's Digest, Oct., 1957.
                                            (q8F, pp. 1-2)
                                        G.: "Geheimnisvolle Sonne," Das Beste aus
                                            Reader's Digest, Oct., 1957.
                                            (q8G, pp. 41-2)

9, 10, 11       Georg Bernanos          E.: The Diary of a Country Priest, Translated
                                            from the French by Pamela Morris, McMillan
                                            Co., New York, 1956. (q9E, pp. 22-3;
                                            q10E, p. 237; q11E, p. 46)
                                       *F.: Journal d'un Curé de Campagne, Plon,
                                            Paris, 1936. (q9F, pp. 23-4;
                                            q10F, pp. 202-3; q11F, pp. 43-4)
                                        G.: Tagebuch eines Landpfarrers, Fischer
                                            Bücherei, 1956. (q9G, pp. 32-3;
                                            q10G, pp. 292-3; q11G, pp. 64-5)

12, 13          G. Révész               E.: The Origins and Prehistory of Language,
                                            Translated from the German by J. Butler,
                                            Philosophical Library, Inc., 1956.
                                            (q12E, pp. 7-8; q13E, p. 158)
                                        F.: Origine et Préhistoire du Langage,
                                            Traduction de L. Homberger, Payot, Paris,
                                            1950. (q12F, pp. 15-6; q13F, p. 162)
                                       *G.: Ursprung und Vorgeschichte der Sprache,
                                            A. Franke A.G., Bern, 1946.
                                            (q12G, pp. 18-9; q13G, pp. 190-1)

14, 15, 16, 17  Marjorie K. Rawlings   *E.: The Yearling, Charles Scribner and Sons,
                                            New York, 1939. (q14E, pp. 22-3;
                                            q15E, pp. 398-9; q16E, p. 200;
                                            q17E, p. 348)
                                        F.: Jody et le Faune, Roman traduit de
                                            l'Américain par Denise Van Moppès, Albin
                                            Michel, Paris, 1946. (q14F, p. 28;
                                            q15F, p. 400; q16F, pp. 204-5;
                                            q17F, pp. 353-4)
                                        G.: Frühling des Lebens, Übersetzung von
                                            Maria Honeit, Rowohlt Taschenbuch,
                                            Hamburg, 1955. (q14G, p. 22;
                                            q15G, p. 370; q16G, p. 184;
                                            q17G, p. 324)

18, 19, 20      C. F. Ramuz             E.: When the Mountain Fell, Translated from
                                            the French by Sarah Scott, Pantheon Books,
                                            Inc., 1947. (q18E, pp. 28-9;
                                            q19E, pp. 100-1; q20E, p. 194)
                                       *F.: Derborence, B. Grasset, Paris, 1936.
                                            (q18F, pp. 32-3; q19F, pp. 118-9;
                                            q20F, pp. 227-8)
                                        G.: Der Bergsturz, R. Piper and Co., München,
                                            1936. (q18G, pp. 24-5; q19G, p. 90;
                                            q20G, pp. 173-4)

* Original version, i.e., the language of composition.


TABLE 2.2

Passages arranged according to the Flesch readability score
of the English version

Passage                         Flesch Readability
Number    Author                Score**               Classification

17        Rawlings               .6                   Reading Grades 5.9 and below
14        Rawlings               .7                   Description: Very easy
16        Rawlings               .8                   (difficulty of "light novel")
3         Paton                 1.6

20        Ramuz                 1.9                   Reading Grades 6.0 to 6.9
15        Rawlings              2.0                   Typical magazine: True Story
7         Langewiesche (RD)*    2.1                   Description: Easy.
4         Paton                 2.3
11        Bernanos              2.3

10        Bernanos              2.7                   Reading Grades 7.0 to 7.9
9         Bernanos              3.3                   Typical magazine: Liberty
                                                      Description: Fairly easy.

19        Ramuz                 3.9                   Reading Grades 8.0 to 9.9
8         Clark (RD)*           4.0                   Typical magazine: Reader's Digest
5         Johnstone (RD)*       4.1                   Description: Average difficulty.
6         Eastman (RD)*         4.2

12        Révész                4.5                   Reading Grades 10 to 12.9
18        Ramuz                 5.2                   Typical magazine: Harper's Magazine
13        Révész                5.7                   Description: Fairly difficult

1         Kant                  6.9                   Reading Grade 17.0 and above
2         Kant                  6.9                   (college graduate)
                                                      Typical magazine: Scientific Monthly
                                                      Description: Very difficult.

* (RD) indicates selection from Reader's Digest.

** The Flesch readability score is derived from the formula (12):

   Readability score = .1338 X_s + .0645 X_m - .0659 X_h - .7502,

   where X_s is a measure of words per sentence,
         X_m is a measure of the prefixing and affixing of the words,
         X_h is a measure of the number of personal, human interest, references.

i. n3Sj<*«“ 
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of the sources finally chosen, with "ohe exact page numbers on which the 
passages were found. Twenty passages in all, from 10 different authors, were 
made the basis of the word-cloze tests. Table 2.2 presents infoimation showing 
the range of difficulty obtained for the passages in English. 
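
The readability score cited in the footnote to Table 2.2 is a simple linear
combination, and a minimal sketch of it is given here. The coefficients are
assumed to be those of Flesch's 1943 formula; the function name and the
argument names are illustrative only and do not come from the report.

```python
def flesch_1943_score(x_s: float, x_m: float, x_h: float) -> float:
    """Readability score from three text measures.

    x_s: measure of words per sentence
    x_m: measure of prefixing and affixing of the words
    x_h: measure of personal (human interest) references
    Coefficients are an assumption (Flesch's 1943 formula).
    """
    return 0.1338 * x_s + 0.0645 * x_m - 0.0659 * x_h - 0.7502


if __name__ == "__main__":
    # Hypothetical measures for an imaginary passage, not data from the study.
    print(round(flesch_1943_score(20, 30, 5), 4))
```

Longer sentences and more affixes raise the score (harder text), while more
personal references lower it.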

Preparation of word-cloze tests 

Two decisions have to be made in preparing a word-cloze test: (1) the 
length of continuous text from which deletions are to be made, and (2) the rule 
for deleting words within a stretch of text. 

Most of Taylor's work with word-cloze has been concerned with the
"readability" of texts over sizeable stretches, 175 words and up. He has thus
allowed the subject to take advantage of the total force of context
accumulating in rather lengthy passages. In preparing word-cloze tests in
foreign languages, we expected many of our subjects to have competence far
short of that possessed by native speakers, and we therefore desired to make
the tests relatively "easy"; one way of doing this was to use relatively
lengthy passages (205 words in length) from which subjects would be able to
derive abundant contextual clues. Use of passages any longer than 205 words
would have had the disadvantages of lengthening the total test beyond
reasonable bounds, and incidentally of making some passages occupy more than
a single double-spaced page.

The other decision which had to be made was that of choosing a rule
for deletion. Taylor has tried various rules of deletion ranging from one
word in five to one word in ten; for the measurement of prose readability
he prefers to delete something like one word in every five or seven.
Nevertheless, we believed that a lower rate of deletion might be more
effective for measuring individual differences in language skill, desiring
if anything to err on the side of giving the examinee abundant evidence on
which to base his guesses. Further, we had no preconception as to what
kind of deletion (in terms of "parts of speech", word frequency, or other
considerations) might be most effective; indeed, we wished to sample all
kinds of word-deletions. In view of these considerations, it was decided
to delete every tenth word. It should be stated here, however, that the
present study has not attempted, and does not pretend, to provide a definitive
answer regarding the optimal rate of deletion for measuring individual
differences.

Criteria were established for counting words. In English and German,
words were identified as strings of letters (including the apostrophe for
contraction) preceded and followed by a space or mark of punctuation. In
French, these criteria were slightly modified so as to count elided words
like d' (as in d'argent) as separate words. Hyphenated words were always
counted as two words, and occasionally a deletion fell on one member of a
hyphenated word.

The 205-word passages always started at the beginning of a paragraph;
naturally, they frequently broke off somewhere before the end of a complete
sentence. Although the same passages were chosen in all three languages,
it was hardly to be expected that they would cover the same amount of
meaning-content. There did not seem to be any consistent parametric
differences between languages, i.e. each language took about the same number
of words to "say the same thing." Such variation as occurred could probably be
attributed to idiosyncrasies of the writers and translators of the texts
rather than to regular differences in language structure.

Each passage was mimeographed on a separate sheet. Counting from the
first word, words 10, 20, 30, ..., 200 were deleted and replaced by a blank
(using the "underline" character of a typewriter) of ten typewriter spaces.
The blanks in each passage were numbered consecutively from 1 through 20.
Thus, every blank except number 20 had nine words of unmutilated text
both before and after it. In instances where a deletion would have fallen
on a number, numerical expression, or proper name, the next word was
deleted instead (without affecting the subsequent counting).¹
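
The counting and deletion rules just described can be sketched in a few lines
of Python. This is only an illustration under stated assumptions: the regular
expressions are one possible reading of the word-counting criteria, the
`is_protected` test recognizes numerals only (proper names would need a
hand-made list), blank numbering is left out, and all names are hypothetical.

```python
import re

# Assumption: a word is a run of letters (internal apostrophes allowed, for
# contractions) or a run of digits; hyphens therefore split a word in two.
WORD = re.compile(r"[^\W\d_]+(?:['’][^\W\d_]+)*|\d+")
# Assumption for French: an elided form such as d' ends at the apostrophe.
WORD_FR = re.compile(r"[^\W\d_]+['’]?|\d+")


def make_word_cloze(text, step=10, n_blanks=20, french=False,
                    is_protected=str.isdigit):
    """Replace words 10, 20, ... with a ten-space blank; return (cloze, answers)."""
    tokens = list((WORD_FR if french else WORD).finditer(text))
    pieces, answers, last_end, last_idx = [], [], 0, -1
    for k in range(1, n_blanks + 1):
        idx = k * step - 1                      # 0-based index of word 10, 20, ...
        # If the deletion falls on a protected word, take the next word
        # instead; the count for later blanks is not shifted.
        while idx < len(tokens) and is_protected(tokens[idx].group()):
            idx += 1
        if idx >= len(tokens) or idx <= last_idx:
            break                               # passage too short
        match = tokens[idx]
        pieces.append(text[last_end:match.start()])
        pieces.append("_" * 10)                 # blank of ten underline characters
        answers.append(match.group())
        last_end, last_idx = match.end(), idx
    pieces.append(text[last_end:])
    return "".join(pieces), answers
```

The returned `answers` list serves as the scoring key for the passage. The
French pattern is deliberately crude; it would also split a form such as
aujourd'hui, which a hand count might treat differently.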

Each version (English, French, and German) of each of the 20 passages²
was subjected to the word-deletion procedure once; the resulting mutilated
passages were used in the experiments in various combinations, as described
in subsequent chapters.

The word-cloze tests were scored by counting the number of words for
each passage which exactly corresponded to the words found in the original
(completions which had even minor errors of spelling were counted wrong).
According to Taylor (26), this scoring procedure is as effective as any.
On a priori grounds, however, a case could be made for scoring in terms of
community of response, i.e. giving a positive weight to any response of
high frequency in a normative sample. Such a scoring procedure might
enhance reliability and even validity because the "correct" response would
correspond to a sort of linguistic norm rather than to the possible
idiosyncratic item found in the original text. It should be remembered, also,



1 There are three instances in the corpus of test material where the
deletion was inadvertently allowed to fall on proper names. In these cases
the scoring admitted as correct either the exact proper name in the original,
or an appropriate pronoun.

2 The first 2 passages (from Kant) were omitted in the English-French
experiment reported in Chapter 3 because of their extreme difficulty.



that Taylor's recommendations are based largely on his work with the
measurement of "readability" rather than individual differences in
performance. To be sure, the "readability" of a passage might quite well
depend on the predictability of the words composing it, but the ability of
readers to predict those words, depending rather on their acquaintance with
the overall statistical characteristics of a language, might be better
tested by means of a community-of-response scoring scheme. Some trials of
such a scheme are reported in subsequent chapters. Nevertheless, it should
be pointed out that the use of community-of-response scoring destroys one of
the chief advantages which might be claimed for the cloze procedure, namely,
that it enables one to dispense with extensive test construction and item
analysis procedures.
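
The two scoring schemes contrasted above can be sketched as follows. Exact
scoring follows the rule stated in the text. The community-of-response weights
are an assumption made for illustration only: each response is weighted by the
share of a normative sample that gave it, provided that share reaches a
hypothetical cutoff `min_share`. The report does not specify the weights
actually tried.

```python
from collections import Counter


def score_exact(responses, key):
    """Count responses identical to the deleted words (misspellings count wrong)."""
    return sum(1 for given, original in zip(responses, key) if given == original)


def community_weights(norm_responses, min_share=0.2):
    """Build one weight table per blank from a normative sample.

    norm_responses: one list of responses per normative subject, aligned by blank.
    Assumption: weight = share of the sample giving the response, kept only if
    the share is at least min_share.
    """
    n_subjects = len(norm_responses)
    tables = []
    for blank_responses in zip(*norm_responses):
        counts = Counter(blank_responses)
        tables.append({word: c / n_subjects
                       for word, c in counts.items()
                       if c / n_subjects >= min_share})
    return tables


def score_community(responses, tables):
    """Sum the normative weights of the responses given."""
    return sum(table.get(given, 0.0) for given, table in zip(responses, tables))
```

Note that `community_weights` needs responses from a normative group before
any paper can be scored, which is exactly the extra item-analysis step the
text says the plain cloze procedure avoids.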

Letter-cloze tests

As in the case of word-cloze tests, comparability with the work of
previous investigators was maintained also for the letter-cloze tests
used in this experiment. In this case we replicated that procedure of
Miller and Friedman (20) [actually Miller and Friedman studied several
mutilation procedures] whereby strings of letters selected randomly from texts
are presented with a blank indicating one letter missing, the subject being
required to predict the missing letter. Miller and Friedman used strings
5, 7, and 11 characters in length and obtained data for each possible
position of deletion. Ability to predict the missing character was found
to be maximal for letters deleted towards the middle of the segment, falling
off gradually as letters were deleted toward the beginning or toward the
final position.

Because this study was concerned chiefly with the measurement of
individual differences, it was considered unnecessary to test Miller and
Friedman's method quite as exhaustively as they did. Still using strings
of 5, 7, and 11 characters, we deleted letters only at the beginning, the
middle, and the end of the string.

As in the word-deletion experiment, it was desired to compare the
[...] not seem to be any good reason to draw the letter-deletion material
from the texts used in the word-deletion tests, since it did not seem likely
that the difficulty of letter-deletion material would be much affected by the
overall difficulty level of the text. Instead, the English materials were
drawn from those which had already been studied by Miller and Friedman. For
the French and German, strings were drawn by a more or less random procedure
from the October 1957 edition of the Reader's Digest in those languages
(Sélection du Reader's Digest and Das Beste aus Reader's Digest). The random
procedure consisted of drawing a diagonal line down a page of text and
copying the string intersected by this line in each successive line of text.
Miller and Friedman excluded strings which contained proper names, numerals,
and punctuation marks (admitting a space between words, however, as a
27th "character" in the alphabet). The same procedure was followed for
French and German, except that in French, the apostrophe was permitted as
a character.

In each language, the test booklets contained 9 pages, each page
containing 50 items of a given length (5, 7, or 11 characters) and place of
deletion (initial, medial, or final). The order of pages for each language
was 11a, 11b, 11c, 7a, 7b, 7c, 5a, 5b, 5c, where 'a', 'b', and 'c' denote,
respectively, deletion of initial, medial, and final character. The English
pages appeared first in each booklet but subjects were told they might do
the pages in any order they pleased. At the top of each page were printed
all the distinct permissible symbols for the language used there (including
* for space between words). This meant that in French the subject's attention
was drawn to the fact that such symbols as c, ç, e, é, è, ê were to
be regarded as different;³ in German, subjects were shown a, ä, o, ö, u, ü
as different symbols. Instructions and sample pages for the letter-deletion
tests are to be found in the Appendix.
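
A letter-cloze item of the kind described can be sketched as below. The
underscore used for the blank and the function and variable names are
assumptions for illustration; the asterisk for a space between words follows
the description of the symbol lists printed at the top of each page.

```python
# Page order within one language: string length, then place of deletion
# ('a' initial, 'b' medial, 'c' final).
PAGE_ORDER = [f"{length}{place}" for length in (11, 7, 5) for place in "abc"]


def make_letter_item(string, position, blank="_", space="*"):
    """Return (item, answer) with one character of `string` deleted.

    position: "initial", "medial" or "final". For the odd lengths used
    (5, 7, 11) the medial character is the exact middle one.
    """
    index = {"initial": 0,
             "medial": len(string) // 2,
             "final": len(string) - 1}[position]
    shown = string.replace(" ", space)      # a space counts as a 27th character
    return shown[:index] + blank + shown[index + 1:], shown[index]


if __name__ == "__main__":
    for place in ("initial", "medial", "final"):
        print(make_letter_item("the cat", place))
```

Because the answer is returned exactly as it appears in the string, a letter
carrying a diacritical mark is only matched by the same letter with the same
mark.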

The scores for the letter-deletion test were the number of letters
which exactly coincided with the letters deleted in the original text.
The letter had to be supplied with the exact diacritical marking, if any.
The usefulness of a scoring system based on community-of-response was also
investigated, as reported in Chapter 5. This was done despite the
consideration stated earlier in this chapter that the use of item analysis
procedures (necessary to establish community-of-response scoring) tends to
vitiate some of the advantages claimed for the cloze procedure.



3 Inadvertently, a was omitted from the French character list. This
omission did not seem to inhibit anybody's using a when it was required.
The a was never a correct answer. Nevertheless, it was used at least once.
The error was rectified in the high-school testing.

CHAPTER 3

Try-out of Written Tests with English-French
and English-German Bilinguals

In theory, or at least in a comfortably ideal world, a test of competence in
a foreign language should yield maximum and nearly uniform scores for native
speakers of that language, because nearly every native speaker of a language
acquires a certain set of language habits which are prerequisite to his
functioning as an effective member of a speech community. The extent to
which he goes beyond this minimal set of language habits is presumably a
function of his general intelligence and his degree of education, but an
ideal test of language competence should be unaffected by intelligence and
degree of education.

Unfortunately, one can hardly expect to achieve this ideal. The very
process of testing ordinarily calls upon certain cognitive skills which are
independent of language competence and which, if present, will almost
inevitably heighten test performance. If we can assume, however, that the
individuals who are likely to take College Board foreign language
examinations are relatively uniform in intelligence, degree of education, etc.
(relatively, that is, in comparison with the total population), it makes
sense to compare their performance with that of native speakers of the
language who have somewhat similar degrees of intelligence and education.
(We speak of "degree of education" not in the formal sense of number of years
of education completed, but in the sense of general achievement.) This
chapter reports some of the results obtained by applying cloze procedure
tests to such samples of native speakers of French and German as we were able
to obtain in the metropolitan Boston area. Because of the methods of
recruitment employed, we found ourselves with data from a number of individuals
whose native language was English but who also had high competence in French
or German. Data from these subjects were found to be valuable as control
data, and in any case, there was interest in seeing whether use of the cloze
procedure would clearly distinguish between native speakers of a language
and persons who have acquired a high degree of competence in the language as
a second language. The bilinguality of the subjects was therefore a crucial
factor in the design and analysis of this experiment.

Subjects 

Subjects were obtained largely through the cooperation of associations
of teachers of French and of German in the metropolitan Boston area. In each
case, representatives of the project visited a regular meeting of the
association and asked for volunteers to come to Harvard for what was estimated
to be a two-hour testing session. Members attending the meeting were told that
persons with a good knowledge of French or German were desired, with a
preference for individuals with native facility. Emphasis was put on the
necessity of establishing standards of performance on certain tests being
considered for possible use in the College Board program.

Some individuals were recruited by a variety of other means, e.g. by
inviting volunteers to ask their friends or relatives to participate, and by
mail solicitation of persons listed as high school teachers of German. On
the whole, it proved to be difficult to obtain any large number of subjects,
and the persons actually obtained as volunteers were recruited only after
rather considerable effort. (For example, the principal investigator went so
far as to write and deliver a speech in German in order to comply with the
requirement of the German teachers' association that no English be used at
meetings.)

The total number of volunteers obtained in French was 17; in German, 22.
When the volunteers arrived for testing, they were first asked to fill out
a questionnaire (see Appendix A) regarding native language, degree of
education, and other relevant information. On the basis of their responses
to a question regarding relative fluency in their two languages, subjects
[...] instance in which a subject reported that he was equally fluent in two
languages, he was classified according to the country of his birth and youth.
Several individuals were very difficult to classify because of very early
bilingualism. At any rate, it can be safely said that every person listed
as a native speaker of a language had spoken that language since early
childhood, and on the record alone, many such individuals also had "native" or
near-native proficiency in one or more other languages.

After all testing was completed, data were complete for the following
numbers of cases, all of whom were adults (the numbers in parentheses are the
number of cases who also completed the letter-cloze materials):

                      English-French            English-German
                        Bilinguals                Bilinguals         Grand
                    Men   Women   Total       Men   Women   Total    Total

Native German
or Native French    2(1)   2(2)    4(3)       5(4)   7(4)   12(8)    16(11)

Native English      6(6)   6(6)   12(12)      9(3)   1(1)   10(4)    22(16)

Total               8(7)   8(8)   16(15)     14(7)   8(5)   22(12)   38(27)


All subjects, both native English and native French or German, had
attained a relatively high level of formal education; they all had at least
bachelor's degrees or the European equivalent. Several had master's or
doctor's degrees or were in the process of acquiring them. The age range of
the French subjects was approximately 22 to 60 years; of the German subjects,
24 to 60 years. Typically, subjects had acquired a knowledge of their second
language by formal courses in it, although a few had acquired the second
language (English) solely by absorbing it in the course of living and working
in English-speaking countries. Most of the native English speakers had
traveled or lived abroad in the countries of their second language.

In all cases, the subjects reported they were currently using both their
languages in [...] of the experiment was fulfilled as well as might be
expected; that is to say, the degree of bilingualism of these groups made it
possible to make a meaningful comparison of performances in two languages,
i.e. French vs. English, and German vs. English. (There were one or two
trilinguals in the group, but it was not considered worthwhile to take
advantage of this, particularly since the third language was reportedly not
as well controlled by these individuals.)



Experimental design 

For both the word-cloze and letter-cloze material, it was desired to
compare subjects' performance in English and in the second language. In the
case of letter-cloze material, it was possible to provide such an extensive
sample of material in either language that there was no problem of insuring
comparability of the basic material across languages, and it was not even
considered necessary to use the same basic material (the same texts in
different languages, that is to say) across languages. Therefore, all
subjects were simply given all letter-cloze materials in both languages.
The letter-cloze materials themselves varied in both length of string and
position of deletion, as described in Chapter 2.

The case was somewhat different for the word-cloze materials. It was
desired to limit the testing session to approximately two hours, and it was
estimated that the subjects could not comfortably complete work on more than
about 20 passages (each with 20 deletions from a text originally 205 words
in length) in that time. Furthermore, it was obviously undesirable to give
a subject the same passage in both languages. The experimental design
therefore called upon each subject to do 10 passages in English and 10 other
passages in the other language; subjects were paired, however, so that any
passage which was done in English by one subject was done in the other
language by the other subject. The assignment of passages in their English
and foreign language versions was done by a randomizing procedure which
varied from subject to subject, and the order of presentation of the passages
in a test booklet was also randomized to help cancel systematic effects of
practice and fatigue. Thus, each subject in the experiment had a uniquely
composed booklet. This design made it possible not only to assess the
performance of the subjects in two languages but also to assess the comparative
difficulty of the passages in the two languages and the comparative
redundancy of the languages themselves.
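
The paired, randomized assignment described above might be sketched as
follows. The report does not say what randomizing device was used, so the
use of Python's `random` module, the function name, and the booklet
representation are all assumptions made for illustration.

```python
import random


def assign_pair(passages, rng):
    """Build complementary booklets for one pair of subjects.

    Each booklet is a list of (passage, language) tuples in presentation
    order. Half of the passages are English for subject A and foreign for
    subject B; the other half the reverse. Each order is shuffled separately.
    """
    shuffled = list(passages)
    rng.shuffle(shuffled)                        # random split into two halves
    half = len(shuffled) // 2
    english_for_a, foreign_for_a = shuffled[:half], shuffled[half:]
    booklet_a = ([(p, "English") for p in english_for_a] +
                 [(p, "foreign") for p in foreign_for_a])
    booklet_b = ([(p, "foreign") for p in english_for_a] +
                 [(p, "English") for p in foreign_for_a])
    rng.shuffle(booklet_a)                       # random order of presentation
    rng.shuffle(booklet_b)
    return booklet_a, booklet_b


if __name__ == "__main__":
    a, b = assign_pair(range(1, 21), random.Random(1958))
    print(a[:3], b[:3])
```

Within a pair, every passage is therefore seen once in each language and
never twice by the same subject.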

Each subject completed three distinct instruments:

(1) A questionnaire, mentioned previously, to yield data on how well
the subjects knew the languages and how they acquired this knowledge;

(2) Twenty (for the German-English group) or 18 (in the English-
French group) "word-cloze" passages, half in English and half in the second
language;

(3) Eighteen sets of letter-cloze materials, of which nine were in
English and nine were in the second language.

The first two of these instruments were filled out in group testing
sessions at Harvard conducted by research assistants. Sessions for French
and German were held at different times; because of the exigencies of
scheduling it was necessary to give subjects the option of coming at one or
another of several different sessions. During the intermission of each
testing session, subjects were given instructions and materials for
completing the letter-cloze materials; they were asked to do this work at home
and return it by mail. (There was a loss of one French case and ten German
cases for the letter-cloze experiment on this account.)

The detailed procedures followed in the testing will now be described.
In group testing, subjects were seated around a large table and given
a brief explanation of the purpose of the experiment. They then filled out
the questionnaire, after which they were told to read the instructions for
the test. The text of these instructions follows:

"In the passages that follow, every tenth word has been deleted.
Please fill in each blank with the word that seems most probable to you.
Do not try to be imaginative; put in the word that you think most people
would put in that space. Only one word has been deleted each time, and
all punctuation has been left in. Hyphenated words count as two words.

We recommend the following procedure:

1. Read the whole passage through first without putting down any
words.

2. Fill in all the blanks which seem fairly obvious and put down
two or three alternative answers for the rest.

3. Select the most suitable alternative and reread the passage
to be sure it makes sense.

4. Work quickly; do not spend too much time on any one passage."

(These instructions were original with this project. At the time they
were prepared, Taylor's instructions were not available. Since then,
comparison of these and Taylor's instructions shows only that our instructions
lay more stress on trying to get the whole meaning of the paragraph before
starting to fill in blanks.)

A few minutes were given for subjects to ask questions about the pro- 
cedure. Subjects often asked whether they could go back to a passage after 
they had left it and gone on to another; they were told that they should go 
back to passages they wanted to work over only after they had finished the 
whole set of passages. 

Before allowing the subjects to proceed with their test booklets, the
examiner made the following additional remarks:

"This is not a timed test. You are free to proceed at your own
rate. You will find that some passages are harder and will take longer
than others. We would, however, like to get an idea of how long each
passage takes. Thus, if you will raise your hand and signal to the
examiner when you have finished a passage and are ready to go on with
the next passage, he will be able to jot your time down."

The examiner then said "Ready, begin!" and started his clock. As the
subjects finished a passage, they signaled and went on to the next passage.
The examiner kept a record of times for each subject.

In a few instances, subjects did not adhere to the timing instructions.
They either neglected to raise their hands or they went through the test
books doing the passages in one language first and then doing the passages
in the other language. In these rare cases, the time record of the subject
was not kept. In the first testing session it was feared that some subjects
would fall far behind the others and take very long with the test. Thus
after a rate for most subjects was ascertained (about 6 to 7 minutes per
passage) the examiner once in a while suggested "You should be up to
passage number ..." whenever he felt that one or another subject was falling
too far behind. After the first session it was realized that subjects tended
to finish booklets at about the same time and that some tended to start
working quickly without accelerating their rates while others would start
working slowly and get much faster as time went along. No further use was
made of the times except to get a general idea of the time it took to do a
passage. It may be said that the tests were administered under essentially
work-limit conditions.

About half-way through the testing session there was a break for
refreshments, during which the nature of the letter-cloze experiment was
explained to the subjects. Informal questioning at the end of the test
session revealed that the subjects found the test interesting and
challenging but that they did not feel that they had done very well.

The entire testing session, including the introductory remarks and the
break, took about two and a half or three hours.

Overview of results of word-cloze tests

The main purpose of this experiment was to establish normative data for
[...]
These normative data are presented in Tables 3.3 to 3.5, which contain the
mean scores and other statistics for each passage as performed by one or the
other of the several groups which were studied.

Passages 1 and 2 proved to be excessively difficult for the German
speakers; consequently they were not used at all in the French group. Certain
data on these passages will be presented, for completeness, but they will in
general be excluded from statistical summaries.

For individuals doing passages in their native language, the mean
passage scores (exclusive of passage 1 and 2 means) range in the following
way. (Maximum possible score is 20 in each case.)

English   8.9 to 16.0 with a median at 11.9

French    9.5 to 17.0 with a median at 11.7

German    8.0 to 16.0 with a median at 11.7

For individuals doing passages in a second language the results are as
follows:

French-speaking doing English: 6.0 to 15.0 with a median at 11.7

German-speaking doing English: 6.9 to 12.8 with a median at 9.1

English-speaking doing French: 7.3 to 17.2 with a median at 12.2

English-speaking doing German: 5.6 to 15.2 with a median at 8.5

It is evident that the passages vary considerably in difficulty, as
might be expected. They also differ in their reliabilities, as indicated by
their correlations with total score.

TABLE 3.2

Means, Standard Deviations and Correlations of French Passages
with Remaining Total Scores in French

           Native English   Native French
Passage       Subjects        Subjects       All Subjects Combined
  No.          N = 6           N = 2                 N = 8
               Mean            Mean          Mean       s        r

   3           14.2            15.5          15.0     1.51      .25
   4           12.2            15.0          12.9     1.81      .36
   5           12.7            14.0          13.0     1.85      .38
   6           11.8            11.5          11.8     1.16      .17
   7            9.0            10.0           9.2     2.25      .24
   8           12.2            11.0          11.9     1.96      .63
   9            7.3            10.5           8.1     2.03      .71
  10           12.5            12.0          12.4     2.20      .65
  11           13.8            10.5          11.8     1.67      .70
  12           11.2            11.5          11.2     2.38      .56
  13           13.0            14.0          13.2     1.49      .82
  14           10.2            10.0          10.1      .99      .19
  15           12.3            12.5          12.4     1.19      .43
  16           14.2            13.0          13.9     1.81      .27
  17           10.0             9.5           9.9     2.17      .69
  18            8.3            11.0           9.0     2.51      .81
  19           10.7            12.0          11.0     1.51      .53
  20           17.2            17.0          17.1     1.36     -.16

TABLE 3.3

Means, Standard Deviations, and Correlations of German Passages
with Remaining Total Scores in German

Passage     English Subjects         German Subjects       All Subjects Combined
  No.      N   Mean    s     r      N   Mean    s     r      N   Mean    s     r

   1       ?     ?     ?     ?      ?     ?     ?   .90     11    6.5  2.98   .92
   2       5    4.8  3.33   .63     6    4.3  1.86   .80    11    4.5  2.34   .53
   3       6   15.2  1.94   .73     5   16.0  1.87   .12    11   15.5  1.86   .52
   4       5    9.6  2.19   .66     6   11.8  1.60  -.30    11   10.8  2.14   .67
   5       6    8.7  2.80   .82     5   14.4  2.70   .25    11   11.3  3.98   .80
   6       4    9.2  1.71   .87     7   10.3  1.93   .75    11    9.9  1.87   .78
   7       4    8.0  3.37   .84     7    9.9  2.34   .57    11    9.1  2.75   .70
   8       6    8.3  3.27   .88     5   14.6  1.67   .86    11   11.2  4.14   .92
   9       5    6.6  2.70   .82     6    8.7  2.25   .90    11    7.7  2.57   .87
  10       5    5.6  2.30   .65     6    8.3  2.58   .64    11    7.1  2.74   .72
  11       6    7.8  4.17   .95     5   11.6  1.82   .52    11    9.5  3.72   .91
  12       5    6.6  2.74   .74     6    9.0  1.67   .80    11    7.9  2.16   .77
  13       2    6.0  2.83   --      9    8.3  2.55   .76    11    7.9  2.62   .78
  14       4   13.0  2.16   .95     7   14.6  1.27   .16    11   14.0  1.73   .71
  15       6    9.8  2.14   .86     5   12.8  2.64  -.11    11   11.2  2.23   .82
  16       6    9.3  1.97   .42     5   10.6  1.14   .71    11    9.9  1.70   .58
  17       3   12.0  1.73   .40     8   13.6  2.88   .52    11   13.2  2.64   .53
  18       6    6.8  3.37   .90     5    8.0   .67   .13    11    7.4  2.50   .76
  19       6    7.7  4.03   .82     5   11.8  1.30  -.10    11    9.5  3.67   .83
  20       5   11.4  2.30   .63     6   12.0  1.55  -.51    11   11.7  1.85   .13

TABLE 3.4

Summary Statistics on English Passage Scores

                              Native     Native     Native    All Subjects
Statistic                     English    French     German      Combined
                              N = 22     N = 4      N = 12       N = 38

Mean of Passage Means          12.1       11.5        9.1        11.2

S.D. of Passage Means          2.08        ?         1.57        1.77

Reliability of a single
passage as a measure of
individual differences
(adjusted for passage
differences)                   .622        ?         .245          ?

S.E. of measurement
for a single passage           1.08        *         1.82        1.58


TABLE 3.5

Summary Statistics on French and German Passage Scores

                                  French Passages               German Passages
Statistic                    Native   Native              Native   Native
                             English  French  Combined    English  German  Combined
                             N = 12   N = 4    N = 16     N = 10   N = 12   N = 22

Mean of Passage Means         11.8     12.5      ?          9.0     11.5     10.3

S.D. of Passage Means          2.4      *       2.1         2.5      2.4      2.3

Reliability of a single
passage as a measure of
individual differences
(adjusted for passage
differences)                  .436      ?      .365        .704     .549     .703

S.E. of measurement
for a score on a
single passage                1.29      ?      1.46        1.51     1.27     1.52


Not computed as K is small 
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Reliabilities of scores on word^cloze passages 

The reliabilities of scores on each of the cloze passages were estimated
by the technique propounded by Ebel (14). This technique is essentially an
extension of intra-class correlation and uses the basic procedures of
analysis of variance; it yields not only (1) the reliability of a single
observation but also (2) the reliability of a composite score from k
observations. In obtaining the reliability of scores on the cloze passages,
we were faced with the difficulty that in general each subject had a score
on a different set of cloze passages, because of the experimental design
requirements. In Ebel's technique, one way of handling this situation is to
ignore the variance due to passages, and this is justified if one is
interested in the reliability of the total scores on a random set of k
passages.
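
The two coefficients just mentioned can be illustrated with a short sketch. It uses the usual intra-class formulas expressed in terms of analysis-of-variance mean squares; the function name and the numerical mean squares are hypothetical and are not taken from the present data.

```python
def intraclass_reliability(ms_subjects, ms_error, k):
    """Return (reliability of one observation, reliability of a k-observation composite).

    ms_subjects: mean square between subjects
    ms_error:    mean square for error
    k:           number of observations (passages) per subject
    """
    numerator = ms_subjects - ms_error
    r_single = numerator / (ms_subjects + (k - 1) * ms_error)
    r_composite = numerator / ms_subjects
    return r_single, r_composite


# Hypothetical mean squares, for illustration only.
r1, rk = intraclass_reliability(ms_subjects=10.0, ms_error=2.0, k=9)
print(round(r1, 3), round(rk, 3))
```

The composite value equals what the Spearman-Brown formula gives when the single-observation value is stepped up to k observations, which is a convenient check on the arithmetic.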

In computing reliabilities of single passage scores, it was desired to take
account of the variance due to passages. This could not be computed in the
ordinary way because of the difficulty mentioned above, viz., the fact that
each subject had a score on a different set of passages, or, stated
differently, the fact that a given passage was not taken by all subjects.
It was therefore necessary to estimate the "sum of squares" for passages so
that a proper error term could be determined. For the case of complete data
it can be shown that the sum of squares for passages is equal to
$N\sigma_{\bar{x}}^{2}$, where N is the total number of observations for all
n persons and k passages, and $\sigma_{\bar{x}}^{2}$ is the variance of the
means of passage scores. It was decided to use this relationship as a basis
of estimating the mean square of passages. Thus $\sigma_{\bar{x}}^{2}$ was
computed as the variance of 18 passage means. The sum of squares for
passages $(S.S._p)$ was then computed as $N\sigma_{\bar{x}}^{2}$. The sum of
squares for error was then computed as:

$$(S.S._e) = (S.S._t) - (S.S._s) - (S.S._p)$$

where the three terms on the right are, respectively, the sum of squares for
total, for subjects, and (as estimated) for passages. The number of degrees
of freedom for error was given by an expression involving the average number
of passages taken by each subject, as computed by Ebel's formula (5); the
expression itself is illegible in the source.

The resulting mean square is taken as the variance of measurement for a
score on a single cloze passage.
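
The estimation procedure described above may be sketched as follows. The data layout (a dictionary of subjects, each holding scores on whichever passages that subject took) and the function name are assumptions made for illustration; the variance of passage means is taken with divisor k, which is the form for which the complete-data identity holds exactly.

```python
from statistics import pvariance


def estimated_sums_of_squares(scores):
    """scores: {subject: {passage: score}}; subjects may take different passages.

    Returns (SS total, SS subjects, estimated SS passages, SS error).
    """
    all_scores = [s for taken in scores.values() for s in taken.values()]
    n_obs = len(all_scores)
    grand_mean = sum(all_scores) / n_obs

    ss_total = sum((s - grand_mean) ** 2 for s in all_scores)
    ss_subjects = sum(
        len(taken) * (sum(taken.values()) / len(taken) - grand_mean) ** 2
        for taken in scores.values()
    )

    # Passage means, each based on the subjects who happened to take it.
    by_passage = {}
    for taken in scores.values():
        for passage, s in taken.items():
            by_passage.setdefault(passage, []).append(s)
    passage_means = [sum(v) / len(v) for v in by_passage.values()]

    # SS for passages estimated as N times the variance of passage means.
    ss_passages = n_obs * pvariance(passage_means)
    ss_error = ss_total - ss_subjects - ss_passages
    return ss_total, ss_subjects, ss_passages, ss_error


# Small complete-data example (invented scores), where the identity is exact.
demo = {"A": {"p1": 1, "p2": 3}, "B": {"p1": 2, "p2": 6}}
print(estimated_sums_of_squares(demo))
```

With incomplete data the passage term is only an estimate, which is why the resulting error term should be regarded as approximate.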

Pertinent reliability data are shown in Tables 3.4 and 3.5. The
reliability coefficients themselves are a function of the range of variation
of subject performance, but the standard errors of measurement are in theory
independent of this variation. (In estimating the standard error of
measurement for a test of double length as needed in Table 5.8, the values
were multiplied by $\sqrt{2}$.)
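
The multiplier for a test of double length can be derived in one line, on the assumption (ours, not stated in the text) that the errors on the two halves are independent and of equal variance:

```latex
\sigma_{e,2}^{2} = \sigma_{e}^{2} + \sigma_{e}^{2} = 2\,\sigma_{e}^{2}
\quad\Longrightarrow\quad
\sigma_{e,2} = \sqrt{2}\,\sigma_{e}
```

For example, a single-passage standard error of 1.58 would become approximately 2.23 for a score based on two passages.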

It is of interest to note that the characteristic standard error of
measurement of a single score on a cloze passage of 20 items is approximately



Comparability of word-cloze tests across languages

It was hoped that the data collected in this experiment would yield
information concerning the comparability of word-cloze scores in the three
languages being studied. Serious problems arise in attempting to assess the
data for this purpose. Ideally, large comparable random samples of native
speakers should have been tested, comparability being established either by
selection (common amount of education, for example) or statistically (by
controlling on some relevant variable which would be identical for all
groups, e.g. a non-language test of intelligence). Our data did not even
begin to approach this ideal, and we did not have a proper control variable
available.

Two techniques were employed to suggest provisional answers to the
question. In the first of these, an analysis of variance was made of the
mean total scores obtained by the three groups of native speakers (English,
French, and German), each doing word-cloze materials in their own language.
For each individual, the total score was obtained as the sum of whichever
9 passages out of passages 3 through 20 he happened to have worked on; if,
because of the randomization that had taken place, he happened to have done
either 8 or 10 of these passages, the score was adjusted to a basis of 9
passages. Table 3.6 shows the means and standard deviations of the three
score arrays, as well as the results of the analysis of variance. The
F-ratio not being significant, it may be concluded that the word-cloze
materials in the three languages were of comparable difficulty to native
speakers of those languages.
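
For readers who wish to see how an F-ratio of this kind is formed, the following sketch computes a one-way analysis of variance from group sizes, means, and standard deviations. It assumes that the standard deviations were computed with n - 1 in the denominator, which the text does not state; the function name and the example values are hypothetical. Applied to rounded tabled values, it need not reproduce the printed mean squares exactly.

```python
def anova_from_summary(groups):
    """groups: list of (n, mean, sd) tuples, one per group.

    Returns (mean square between, mean square within, F).
    """
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / total_n

    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)

    ms_between = ss_between / (len(groups) - 1)
    ms_within = ss_within / (total_n - len(groups))
    return ms_between, ms_within, ms_between / ms_within


# Invented example: two groups of two scores each, [0, 2] and [3, 5].
print(anova_from_summary([(2, 1.0, 2 ** 0.5), (2, 4.0, 2 ** 0.5)]))
```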

TABLE 3.6

Analysis of Variance of Total Scores of Three Native-Language Groups

Group        N      Mean      S. D.     Analysis of Variance Results

English      22     103.9     27.0      s_b^2 = 143.8
French        4     110.2     12.0      s_w^2 = 510.6
German       12     103.3     13.6      F = .28   Not significant
Total        38     104.4     22.45


The second technique was a comparison by analysis of covariance of the
second-language cloze scores of native English and native German speakers,
using their native-language cloze scores as a control variable. (The number
of native speakers of French was so small as to make it undesirable to apply
the technique to the comparison of the native English and the native French
speakers.) Total scores used in this analysis were the total scores on
passages 1 through 20; each total score was based on 10 passages. The
results are shown in Table 3.7, and it is evident that the difference
between the second-language cloze scores of native English and native German
speakers was not significant. (Actually, this result shows little about the
comparability of English and German word-cloze materials, although it
indirectly assumes comparability; it is more relevant to the evaluation of
whether the groups were equally bilingual, as they appear to have been.)

Our limited data suggest, then, that word-cloze scores on English,
French, and German passages are comparable, at least for native speakers of
those languages.
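
The arithmetic of an analysis of covariance with a single control variable can be sketched as below. The function names and the illustrative inputs are hypothetical; the computation is the textbook one (sums of squares for Y adjusted by the regression on X) and is offered as an aid to reading Table 3.7 rather than as a record of the computing procedure actually followed.

```python
def ancova_one_covariate(sp_total, ssx_total, ssy_total,
                         sp_within, ssx_within, ssy_within,
                         n_subjects, n_groups=2):
    """Return (within-groups regression coefficient, adjusted SS total,
    adjusted SS within, adjusted SS between, F)."""
    adj_total = ssy_total - sp_total ** 2 / ssx_total
    adj_within = ssy_within - sp_within ** 2 / ssx_within
    adj_between = adj_total - adj_within

    df_between = n_groups - 1
    df_within = n_subjects - n_groups - 1  # one d.f. lost to the covariate
    f_ratio = (adj_between / df_between) / (adj_within / df_within)

    b_within = sp_within / ssx_within
    return b_within, adj_total, adj_within, adj_between, f_ratio


def adjusted_mean(y_mean, x_mean, grand_x_mean, b_within):
    """Group mean on Y adjusted to the grand mean on X."""
    return y_mean - b_within * (x_mean - grand_x_mean)


# Invented sums of squares and products, for illustration only.
print(ancova_one_covariate(6.0, 12.0, 13.0, 4.0, 8.0, 10.0, n_subjects=12))
print(adjusted_mean(10.0, 6.0, 5.0, 0.5))
```

As a point of contact with Table 3.7, the within-groups sum of products divided by the within-groups sum of squares for X, 2947.35 / 4020.35, is approximately .733.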

A related question has to do with the consistency of passage difficulty
across languages or across groups of individuals speaking different
languages. These problems can be studied by obtaining rank-order
correlations between various pairs of passage difficulty values reported in
Tables 3.1 to 3.3.
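
A rank-order coefficient of the kind used here can be computed as in the following sketch, which ranks each set of difficulty values (tied values sharing the average rank) and then correlates the ranks. The function names and the example values are hypothetical.

```python
def average_ranks(values):
    """Ranks starting at 1 for the smallest value; ties receive the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_order_correlation(x, y):
    """Product-moment correlation between the ranks of x and the ranks of y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Invented difficulty values for four passages in two groups.
print(rank_order_correlation([1, 2, 3, 4], [1, 3, 2, 4]))
```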

First, the consistency of the rank-ordering of the passages across
groups was studied. The rank-order coefficients are in general high (all
data are for 18 passages):

                                                               Rho
English passage difficulties

   Between native English and native German speakers          .?3

   Between native English and native French speakers          .75

French passage difficulties

   Between native English and native French speakers          .74






German passage difficulties

   Between native English and native German speakers          ....

These results show simply that a passage in a given language tends to retain
its relative order of difficulty regardless of whether it is applied to
native speakers of the language or to persons who know the language only as
a second language.


TABLE 3.7

Analysis of Covariance of Second-Language Word-Cloze Scores (Y),
Controlling on Native-Language Scores (X)

N = 10 native English + 12 native German = 22

                                  Total         Within        Between

Sum of Products                 2785.2727     2947.3500     -162.0773
Sum of Squares for X            4467.0910     4020.3500      446.7400
Sum of Squares for Y            6859.3182     6799.7700       59.5510
Degrees of Freedom                  21            20              1
Correlation Coefficient            .503          .564           --
b (regression of Y on X)           .6235         .7331

Adjusted Sum of Squares for Y   5122.675      4960.020       162.655
Degrees of Freedom                  20            19              1
Mean Square                      256.134       261.054       162.655

F = 162.655 / 261.054 = .6231

F is not significant.

                          Native Language      Second Language
Means                           X                Y        Adjusted Y

German Subjects              108.334          89.083        92.037
English Subjects             117.300          85.800        82.181


The more interesting question is whether the passages retain their
relative difficulty values after translation into another language. The
following results indicate that large changes may occur (all data are for
18 passages):

                                                                        Rho

Between English passages performed by native English and the same
passages in French, performed by native English                        .19

Between English passages performed by native English and the same
passages in French, performed by native French                         .21

Between English passages performed by native English and the same
passages in German, performed by native English                        .33

Between English passages performed by native English and the same
passages in German, performed by native Germans                        .55

There is a suggestion here that despite the considerable changes that may
occur in rank order, (a) consistency is greater between English and German
than between English and French, and (b) consistency is greater when the
passages are in every case performed by native speakers.



Results for the letter-cloze tests

The bilinguals who were tested with the word-cloze materials were given
the letter-cloze booklets so that they might do them at home and return
them. They were invited to call the experimenters at any time to ask
questions about procedure. Although a few enjoyed the task, many found it
onerous, and the final sample is not as large as might be desired, nor is it
evenly distributed among native speakers of the three languages. Seven
native Germans and five native English speakers took the English and German
tests; three native French speakers and twelve native English speakers took
the English and French tests. Thus there were totals of 27 subjects in
English, 12 in German, and 15 in French.

For the score on each page in each language, and for the total score in
each language, the following statistics are given in Table 3.8: mean,
standard deviation, corrected split-half reliability (first 25 items vs.
second 25 items), and standard error of measurement. The scores for each
page represent the number (out of a possible 50) of correctly guessed
letters, correctness meaning restoration of the actual letter standing in
the original text from which the string was taken. The data in the table
concern all cases available for any given test; thus, they include both
native and acquired-language speakers of the language of the test.
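
The two derived statistics in Table 3.8 can be illustrated with a brief sketch. It assumes that "corrected" refers to the Spearman-Brown correction of the correlation between halves and that the standard error of measurement is the standard deviation multiplied by the square root of one minus the reliability; the function names are ours.

```python
import math


def spearman_brown(r_between_halves):
    """Step a half-test correlation up to the reliability of the full test."""
    return 2 * r_between_halves / (1 + r_between_halves)


def standard_error_of_measurement(sd, reliability):
    """Standard error of measurement from a score S.D. and a reliability."""
    return sd * math.sqrt(1 - reliability)


# S.D. and reliability as tabled for page English 5b.
print(round(standard_error_of_measurement(5.47, 0.693), 2))
```

With the S.D. of 5.47 and reliability of .693 tabled for page English 5b, the formula gives about 3.03, which agrees with the tabled standard error. Where the correlation between halves is negative, the correction is not meaningful, which is presumably why the uncorrected value is reported for such pages.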

The mean scores for English pages tend to be slightly lower than those
obtained by Miller and Friedman, as we can show by converting our means to
percentages and setting them against the percentages taken from Miller and
Friedman's Figure 2 (15):

                     Percentage right (letter-cloze tests)

Length of string            This experiment               Miller and
and position          All Cases    Only Native English    Friedman
deleted*              (N = 27)          (N = 17)           (N = 6)

   5a                    39                41                 47
   5b                    52                55                 61
   5c                    40                43                 41
   7a                    52                54                 56
   7b                    72                74                 83
   7c                    54                57                 64
  11a                    48                51                 50
  11b                    90                92                 94
  11c                    54                58                 62

* a = initial; b = medial; c = final





TABLE 3.8

Results for Letter-Cloze Tests

Description of        Language and Code                                      Standard Error
Sample                Designation of Page*     Mean     S. D.   Reliability  of Measurement

17 Native English;    English 5a              19.6?     3.79      .563           2.43
3 Native French;      English 5b              26.07     5.47      .693           3.03
7 Native German;      English 5c              20.14     3.82      .612           2.33
27 Total              English 7a              25.85     3.31      .379           2.06
                      English 7b              35.93     5.07      .850           1.96
                      English 7c              27.00     4.50      .819           1.91
                      English 11a             23.8?     4.58      .724           2.40
                      English 11b             45.22     3.78      .867           1.38
                      English 11c             26.96     4.22      .737           2.17
                      Total English          250.96    28.90

12 Native English;    French 5a               18.66     2.41      .163           2.20
3 Native French;      French 5b               30.93     3.?6      .???           2.54
15 Total              French 5c               20.80     2.66    (-.316)**        2.66
                      French 7a               25.13     3.25      .669           1.87
                      French 7b               35.66     3.21      .530           2.20
                      French 7c               24.86     2.84    (-.137)**        2.84
                      French 11a              26.80     4.05      .628           2.47
                      French 11b              42.47     3.57      .702           1.95
                      French 11c              25.80     3.56      .698           1.96
                      Total French           251.13    20.68

5 Native English;     German 5a               17.00     3.07      .482           2.18
7 Native German;      German 5b               23.33     5.22      .688           2.91
12 Total              German 5c               16.75     4.08      .250           3.54
                      German 7a               20.42     4.67      .834           2.21
                      German 7b               35.00     7.43      .904           2.96
                      German 7c               21.25     4.12      .796           1.86
                      German 11a              17.08     4.46      .481           3.21
                      German 11b              41.67     5.84      .925           1.60
                      German 11c              22.00     3.26      .576           2.12
                      Total German           214.50    37.65

* The code designation of the page shows the number of letters in the
original text and the position of the deleted letter (a = initial;
b = medial; c = final). For example, 5b means that the 3rd or middle letter
of a sequence of 5 letters was deleted.

** Correlation between halves.



Our results for native English speakers are slightly closer to Miller and
Friedman's, as we might expect. Miller and Friedman do not describe their
small sample of subjects sufficiently well for us to judge comparability.
The reader may be reminded that our English tests were precisely those used
by Miller and Friedman.

Nevertheless, the general pattern of the results is similar to what
Miller and Friedman found. Medial letters are much easier to guess than
initial or final letters in a string, and the guesses for the longer strings
are more likely to be correct because of the increased context.

What is remarkable is the considerable amount of variation in
performance even among native speakers of the language of the test. Such
variation, of course, is the basis for the rather high split-half
reliabilities found for most parts of the test as well as for the total
scores. The standard deviations of part scores are such as to suggest that
the range of scores covers approximately from 30% to 70% of the total
possible range, even for native speakers. The letter-cloze test seems to
measure something besides pure language competence; this conclusion is
reinforced by the finding that the correlations between total scores in
German and English and in French and English proved to be .897 and .799
respectively. For bilinguals, performance in one's native language is highly
related to performance in one's second language.

Some interest may attach to the relative ease of letter-cloze tests in
the various languages. The t-tests between the means of total scores showed
that for 15 English-French bilinguals English scores ($\bar{X}$ = 261.9)
were higher than French scores ($\bar{X}$ = 251.3) at the 1% level of
significance, and for 12 English-German bilinguals, English scores
($\bar{X}$ = 236.7) were higher than German scores ($\bar{X}$ = 214.5) at
the .1% level of significance. While the mean difference of the
English-French test may be due to the small number of native French speakers
(only one of the three in the sample scored higher in French than in
English), there was not a single instance where a native German speaker
scored as high in German as in English. On the basis of information theory,
a possible reason for the somewhat lower scores in French and German is the
fact that they use an alphabet larger than the 27 characters (including *
for space) used in English; in effect, French has a 35-character alphabet
with all varieties of letters with accents and other diacritical marks, and
German has a 30-character alphabet.
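
The information-theoretic point can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that every character of an alphabet is equally probable; this gives the maximum uncertainty per character and the chance of a correct blind guess, not the actual redundancy of running text in any of the three languages.

```python
import math


def max_bits_per_character(alphabet_size):
    """Uncertainty per character, in bits, if all characters were equally likely."""
    return math.log2(alphabet_size)


def chance_of_blind_guess(alphabet_size):
    """Probability of guessing a character correctly with no context at all."""
    return 1 / alphabet_size


for language, size in (("English", 27), ("German", 30), ("French", 35)):
    print(language, size,
          round(max_bits_per_character(size), 3),
          round(chance_of_blind_guess(size), 4))
```

On this assumption the maximum uncertainty rises from about 4.75 bits per character for a 27-character alphabet to about 4.91 for 30 characters and about 5.13 for 35.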

It was possible to make direct comparisons between French and German
scores by an analysis of covariance which controlled the relative ability of
the subjects on the basis of their English scores. The results are shown
in Table 3.9. On the whole, there were no pervasive differences between
French and German scores; the few cases where significant differences
appeared seem to have been accidents of sampling or of test construction.
Total scores, at any rate, showed no significant differences between the
languages when controlled on English, either for the whole group or for the
native English speakers alone.

Finally, correlations between letter-cloze and word-cloze total scores
were computed for each group in each language. The results, shown in Table
3.10, indicate clearly that for both groups of bilinguals there are high
correlations between letter-cloze scores in two languages, but much lower
correlations between word-cloze scores in the two languages. For
English-French bilinguals, word-cloze and letter-cloze scores tend to
correlate substantially, particularly where the language is in common. For
English-German bilinguals, a similar conclusion can be drawn, except for the
fact that the German word-cloze test tends not to correlate significantly
with any other score. The data, however, are based on an extremely small
sample.


TABLE 3.9

Summary of Analyses of Covariance
of French and German Letter-Cloze Scores
Controlling on English Letter-Cloze Scores
(N = 15 English-French Bilinguals and 12 English-German Bilinguals)

Page        F-Ratio      Probability

 5a           1.09       P > .05
 5b          17.82       P < .001 (French better than German)
 5c           7.70       P > .05
 7a           4.46       P > .05
 7b           3.69       P > .05
 7c           6.??       P > .05
11a          21.23       P < .001 (French better than German)
11b           0.36       P > .05
11c           2.85       P > .05
Total         3.34       P > .05




TABLE 3.10

Correlations Between Total Word-Cloze and Total Letter-Cloze Scores

N = 15 English-French Bilinguals

                                  1         2         3         4

English Word-Cloze        1     1.00       .50       .61       .62
French Word-Cloze         2      .50      1.00       .36       .46
English Letter-Cloze      3      .61       .36      1.00       .80
French Letter-Cloze       4      .62       .46       .80      1.00

Mean                           106.59    105.59    262.04    251.11
S. D.                           10.25      9.64     18.91     20.68

N = 12 English-German Bilinguals

                                  1         2         3         4

English Word-Cloze        1     1.00       .06       .68       .65
German Word-Cloze         2      .06      1.00       .06       .14
English Letter-Cloze      3      .68       .06      1.00       .89
German Letter-Cloze       4      .65       .14       .89      1.00

Mean                            95.3      93.7     236.7     214.5
S. D.                           12.8      26.8      32.8      37.6






One hypothesis that should be further investigated is that letter-cloze
tests put particular demands on the individual's spelling ability. A final
word is in order concerning the reliabilities of the parts. The strings
with the middle letter deleted tend to have distinctly higher reliabilities,
and if letter-cloze tests are to be used for measuring second language
proficiency it is advisable to construct them with seven or eleven character
strings with the middle letter deleted.

Summary of this chapter

It is clear that native speakers of a language show considerable
variation in their ability to restore texts in their native language when
the texts are mutilated either by word-deletion or by letter-deletion.
Further, their ability to restore texts in a second language in which they
have near-native proficiency, while slightly (and significantly) poorer than
their ability to restore texts in their native language, is substantially
correlated with the latter ability. This suggests that the ability to
restore texts is somewhat independent of competence in a language as it is
ordinarily defined. That is to say, we observe many people who are perfectly
competent and literate in a language but who do not show facility in the
special task of guessing what a missing letter or word in a text might be.
If we wish to propose "cloze"-technique tests for measuring proficiency in a
second language, it will be necessary to adjust for the individual's ability
to perform "cloze" tests in his native language.


Chapter 4

Special Studies of Characteristics of Cloze Tests

The results of the tests of bilinguals raised a number of questions to
which answers would be desirable if cloze tests were to be used in any
foreign language achievement testing program. Among these questions were:

1. To what extent is the difficulty value of a passage sensitive to the
particular set of words deleted?

2. To what extent is the performance of examinees on word-deletion
materials dependent upon cues from the total passage, i.e. cues beyond the
immediate context of a deleted word? Are these cues more potent when the
paragraph is presented in its original form than when it is presented in
scrambled form?

3. What kinds of items, from the point of view of syntactical structure,
are most susceptible to the influence of paragraph cues?

4. To what extent is the same ability called for when paragraph cues
are or are not provided?

5. What is the nature of the ability to guess deleted words? To what
extent is it related to various established "factors" of cognitive ability?

Because of the lack of time, funds, and a plentiful supply of willing
subjects, it was impossible to explore these questions as thoroughly as
might eventually be desired. This chapter reports the results of a limited
pilot experiment, as well as certain relevant results from a collateral
study.

Procedure

Since the concern of this study was with the characteristics of cloze
tests of native language ability, native speakers of English were used, and
all materials were in English.

a. Test materials.

Seven passages (200 words in length) were selected from among the
twenty which had been used in the study of bilinguals. They were selected
as follows:

Two (4E and 16E) were to be used as the basis for control variables
and had to have good reliability. From Table 3.1 it appears
that the mean scores of all subjects on these passages were
13.8 and 10.9 respectively; they had reliabilities (correlations
with total score) of .83 and .96, respectively.

Two (6E and 15E) had to be of medium difficulty and good reliability.
Data from Table 3.1 show means of 9.8 and 10.8, and
reliabilities of .?7 and .?6, respectively. They were to be used
in studying the effect of varying the position of deletion.

Three (1E, 5E, and 12E) had to represent the range of difficulty
used in the previous experiment. Data from Table 3.1 show
means of 7.0, 14.0, and 14.5, respectively. Their reliabilities
(.87, .80, and .41) were not considered particularly
relevant to the problem under study, viz. the effect of
paragraph cues upon the performance on individual items.

Passages 4E and 16E were administered to all subjects without
modification.

For the experimental group, passages 6E and 15E were converted to what
will be designated passages 6N and 15N, by restoring every deletion in 6E
and 15E and then deleting the next word; that is, where the original
passages had the 10th, 20th, ... 200th words deleted, the new passages had
the 11th, 21st, ... 201st words deleted instead. (The usual constraints
against deleting proper names and numerical expressions were observed.) The
control group did the original passages 6E and 15E, however.
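
The construction of alternative deletion versions may be sketched as follows. The function and its parameters are hypothetical, and the sketch does not model the constraints against deleting proper names and numerical expressions; it only shows how shifting the starting point by one word yields a different set of deleted words from the same text.

```python
def make_cloze(words, every=10, offset=0, blank="_____"):
    """Delete every `every`-th word, counting from word number `offset` + 1.

    Returns (mutilated word list, list of deleted words in order).
    """
    mutilated, answers = [], []
    for position, word in enumerate(words, start=1):
        if position > offset and (position - offset) % every == 0:
            mutilated.append(blank)
            answers.append(word)
        else:
            mutilated.append(word)
    return mutilated, answers


# Placeholder text of 201 "words" named w1 ... w201.
text = [f"w{i}" for i in range(1, 202)]
original_version, original_key = make_cloze(text, every=10, offset=0)
shifted_version, shifted_key = make_cloze(text, every=10, offset=1)
print(original_key[:3], shifted_key[:3])
```

With `offset=0` the 10th, 20th, ... words are deleted; with `offset=1` the 11th, 21st, ... words are deleted instead.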

The experimental group also received "pied" passages which will be
designated 1pi, 5pi, and 12pi. These were made up from passages 1E, 5E, and
12E by breaking them into 20 discrete items which retained the four words
immediately preceding a deletion and the five words which immediately
succeeded it; the 20 items were then placed on the page in random order,
with the restriction that two adjacent sentences in the original would never
be adjacent in the pied version. A sample page from a pied version is shown
in Appendix B. It must have been obvious to most subjects that there were
interconnections between the items in pied versions, but the instructions
made no mention of this.

The control group received the original passages (1E, 5E, and 12E).
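
The making of a pied version can be sketched in the same spirit. The function below is hypothetical; it treats "adjacent in the original" as meaning consecutive items, which is our simplification of the restriction stated above, and it simply reshuffles until the restriction is met.

```python
import random


def pied_items(words, deleted_positions, before=4, after=5,
               blank="_____", seed=0, max_tries=10000):
    """Cut a passage into short items around each deletion and scramble them.

    deleted_positions are 1-based word numbers. Returns (items, order), where
    order[i] is the original index of the item printed in place i.
    """
    items = []
    for position in deleted_positions:
        i = position - 1
        left = words[max(0, i - before):i]
        right = words[i + 1:i + 1 + after]
        items.append(" ".join(left + [blank] + right))

    rng = random.Random(seed)
    order = list(range(len(items)))
    for _ in range(max_tries):
        rng.shuffle(order)
        # Reject orders in which neighbours in the original remain neighbours.
        if all(abs(a - b) != 1 for a, b in zip(order, order[1:])):
            return [items[k] for k in order], order
    raise ValueError("no acceptable order found; too few items?")


text = [f"w{i}" for i in range(1, 201)]
page, order = pied_items(text, list(range(10, 201, 10)))
print(page[0])
```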

All subjects, both experimental and control groups, also received an
additional page, designated Xpi, consisting of 20 items selected randomly
from among the remaining 13 passages used in the bilingual study reported in
the preceding chapter. As in the other pied versions, each item consisted
of 10 words, the fifth of which was deleted and replaced by a standard-sized
blank.

To make a preliminary study of the nature of the ability to do cloze
passages, we administered to all subjects Thurstone's Four Letter Word test.
This is one of the reference tests for the "Speed of Closure" (Cs) factor
chosen by a committee of experts attending a conference on reference tests
at the Educational Testing Service in 1951 (13). This test was selected for
its brevity and presumed factorial purity; the speed of closure factor
seemed to be the best one to represent the kind of "closure" which Taylor
postulated to be the basis of his "cloze" technique. In Taylor's words,

"'Cloze' is derived from 'closure,' the term some psychologists use to
refer to the notion that humans tend to perceive a familiar pattern
as a whole even when parts of it are missing, obscured, or distorted."
(28)

In the Four Letter Word test, subjects are required to locate familiar
four-letter words embedded in rows of letters such as

AMGBWINDTEYKZCIROCKWQEHOWLOZNPEBELTO

where the words wind, rock, howl, and belt can be found.

It would have been desirable to try tests of other kinds of closure
factors, but this was not feasible in this limited pilot experiment.
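
The task set by the Four Letter Word test can be illustrated mechanically. The sketch below scans a row of letters for four-letter sequences found in a word list; the function name is ours, and the word list is limited to the four words named above, whereas an examinee must of course draw on his whole vocabulary.

```python
def find_four_letter_words(row, vocabulary):
    """Return (starting index, word) for each four-letter word found in row."""
    row = row.lower()
    found = []
    for start in range(len(row) - 3):
        candidate = row[start:start + 4]
        if candidate in vocabulary:
            found.append((start, candidate))
    return found


sample_row = "AMGBWINDTEYKZCIROCKWQEHOWLOZNPEBELTO"
print(find_four_letter_words(sample_row, {"wind", "rock", "howl", "belt"}))
# prints [(4, 'wind'), (15, 'rock'), (22, 'howl'), (31, 'belt')]
```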
b. Experimental design and subjects.

Subjects were Harvard College upperclassmen and Harvard University
graduate students, all paid for their services. Data from the original cloze
experiment on bilinguals were included, when applicable, to supplement the
data for the control group of this experiment.

Subjects were run either individually or in small groups. They were
randomly (in order of appearance) assigned to the experimental group or the
control group.

The experimental group (N = 20) received a booklet consisting of
passages 4E, 16E, 6N, 15N, 1pi, 5pi, 12pi, and Xpi. Each subject's booklet
had the pages in one of several different orders. Fourteen subjects of the
experimental group were also given Thurstone's Four Letter Word test.

The control group (N = 11) received a booklet consisting of passages
4E, 16E, 6E, 15E, 1E, 5E, 12E, and Xpi. That is, except for Xpi, all the
passages were in their original form. Each subject's booklet had the pages
in different random order, except that page Xpi was always last. Nine of
these subjects were given the Four Letter Word test.

The cover page of each subject's booklet presented the instructions for
the cloze materials; these were virtually identical to the instructions which
had been previously used in the bilingual experiment. Since the instructions
referred to "the whole passage," one may assume that there was at least some
tendency to consider even the scrambled passages as somehow integrated.

No time limit was set for any part of the test except the Four Letter
Word test, which had a time limit of two and a half minutes.

Scoring of tests

Items were scored right only if the word inserted by the subject was
exactly the same (apart from minor spelling errors) as the word which
appeared in the text from which the passage was taken. Each page was scored
separately; the maximum score possible on any page was 20.

The score on the Four Letter Word test was the number of correct four
letter words the subject succeeded in encircling in the allotted two and a
half minutes.

The effect of altering the position of deletion

Taylor (27) studied two essays 175 words in length; these were
mutilated by deleting every fifth word (35 in all) starting from five
different initial locations to produce five different deletion versions. A
significant overall difference was found in cloze scores attained by
different fifths of a sample of 28? subjects. The particular words which are
deleted in a passage therefore can be expected to make a difference in the
overall cloze score for the passage. This is because individual deletions
vary considerably in difficulty. In order to evaluate a passage accurately
it is necessary to have a relatively large number of deletions. Taylor (27)
has stated his opinion that values stabilize satisfactorily only when there
are at least 50 deletions.

In the present experiment, it was possible to compare cloze scores from
two deletion versions of each of two passages. Comparison was made by an
analysis of covariance of the cloze scores from two groups (an experimental
and a control group), holding ability to do cloze items constant by using
scores from a third passage which both groups did in common. In one case,
there was no significant difference between deletion versions, while in the
other case, the difference produced an F-ratio which was significant at the
.001 level. Our results tend therefore to confirm Taylor's; different
deletion versions can and do sometimes produce significantly different
scores for the passages. This conclusion holds for cloze passages in which
there are 20 deletions occurring as every 10th word in a passage 200 words
long.

The statistical comparisons on which the above statement is based are
from (1) a comparison of scores on passages 6E and 6N holding cloze ability
constant by means of scores on passage 16E and (2) a comparison of scores on
passages 15E and 15N holding cloze ability constant by means of scores on
passage 4E. In the first of these comparisons, data were available for 20
subjects in each group; in the second, there were 24 subjects in the control
group and 20 in the experimental.

The effect of paragraph cues on c3,oge scores 

These effects could be studied at two levels > (1) the effect of the 
par agraph in its original order vs« the effect of the paragraph when ten*- 
word segments of it are scrambled or ’’pied” and (2) the effect of pied 
paragraph cues vs* the absence of such cues, as when ten-word cloze items 
are taken randomly from unrelated paragraphs* 

The first level of effect was studied by means of three analyses of
covariance. That is, scores on 1pi were compared with those on 1E, scores
on 5pi with those on 5E, and scores on 12pi with those on 12E.
For the first two of these comparisons, scores on passage ijB were used as
controls; for the last, scores on passage 16E were used. It was necessary
to use two different control tests because not all the adult bilingual
subjects, who were used to augment the size of the control group, had taken
both tests.

Significant (P < .001) effects of scrambling the items of a paragraph
were found in all three instances. The adjusted means of the original and
scrambled passage scores are as follows:





              Passage 1    Passage 5    Passage 12

Original         7.37        14.66        14.71
Scrambled        4.21        11.77        10.99
Difference       3.16         2.89         3.72



The design of the present study did not yield a precise test of the
second kind of effect: that of having cues from the same paragraph, even when
scrambled, as compared with having no cues. This would have to be done by
an item-by-item comparison of items set in scrambled paragraphs with the
same items assembled from unrelated materials, as in passage Xpi. Nevertheless,
it was possible to compare all 60 available scores of the 20 subjects
on the 3 scrambled passages (regarded as representative of the passages from
which the items of the pied tests were drawn) with the scores of 30 subjects
on the pied test Xpi. The former scores had a mean of 9.2, s = 3.8; the latter
had a mean of 7.6, s = 2.1. A one-tailed t-test for uncorrelated means yielded
t = 2.3, P < .025; this is a conservative test because some of the cases
underlying the two means were identical and there is reason to expect
considerable positive correlation between the scores. At any rate, cues from
paragraph context seem to be influential even when the paragraphs are
scrambled.

It was thought that paragraph organization cues might become more
forceful toward the end of a cloze passage; if so, the mean scores would be
higher toward the end of the passage. This seemed possible despite the fact
that subjects were instructed to examine the entire paragraph before starting
to work on it, because it was observed during testing that subjects tended
to work from the beginning of the paragraph to the end, frequently without
returning to the beginning or to earlier items.

This possibility was investigated by examining the difference in scores
on the first 8 and the last 8 items in passages 1, 5, and 12 and comparing
these differences under the non-scrambled and the scrambled conditions.
According to the experimental hypothesis, the difference (score on the last
8 items minus the score on the first 8 items) should be larger under the
non-scrambled condition (obtained in the control group) than under the
scrambled condition (obtained in the experimental group). It was found,
however, that the mean of the first 8 items was not regularly smaller than
the mean of the last 8 items and that the differences between the mean
differences never approached statistical significance. Thus, the notion
that paragraph cues act cumulatively is not confirmed by these data. It is
conceivable that paragraph cues sometimes work negatively, i.e. a paragraph
cue could "throw off" the subject.

Types of items aided by paragraph cues 

It was demonstrated above that context of paragraph length (as compared
with a context of only 9 words) is a significant factor in permitting the
subject to guess the word that stood in a particular position. Such a result
agrees with theory: the more context, the more information, the better the
guess. However, it is unlikely that context operates by the sheer weight of
information; it is more likely that some words can be guessed through
immediate contexts, others through more remote contexts.



Data from 22 control subjects (some from the bilingual experiment) and
20 experimental subjects were arranged so that the proportion of subjects
getting an item correct when it was part of a continuous text could be
compared with the proportion of subjects getting the same item correct when it
was part of a scrambled text. Items showing a "marked" difference (20 or
more percentage points) between the two conditions were identified. Of the
60 items in all, 26 showed a difference of this extent, but three of these
were in the unanticipated direction, i.e. they were more often guessed
correctly in the scrambled version. The words were classified according to
conventional parts of speech. The following statements concerning the
susceptibility of the conventional parts of speech to the influence of paragraph
cues can be made only tentatively, since in most cases the samples are too
limited:

Prepositions (including "to" when it occurs as a part of the infinitive)
are highly stable. Only 3 out of 13 of these showed marked decrease in
percentage under the scrambled condition.

Nouns typically show marked decrease in percentage guessed under
scrambled conditions: 6 out of 13 showed this change.

Adjectives are relatively stable; only 1 out of 8 decreased 20% under
scrambling.

Verbs are highly influenced by paragraph cues: 6 out of 8 showed 20% or
more decrease in correct guesses under scrambling. The only verbs not
showing this influence were the two auxiliary forms "could" and "should".

There were too few examples of other form-classes to make any useful
statements about them.
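The item-by-item comparison described above can be sketched in Python. This is only an illustration: the percent-correct values are invented, the function name is hypothetical, and only the 20-point threshold for a "marked" difference comes from the text.

```python
def marked_differences(continuous, scrambled, threshold=20.0):
    """Return (item, difference) pairs whose percent-correct values differ
    by at least `threshold` points between the two conditions.

    A positive difference means the item was guessed correctly more often
    in the continuous text; a negative one is the "unanticipated" direction.
    """
    flagged = []
    for item, pct_continuous in continuous.items():
        diff = pct_continuous - scrambled[item]
        if abs(diff) >= threshold:
            flagged.append((item, diff))
    return flagged


# Hypothetical percent-correct values for four items (not data from the study).
continuous = {"item_1": 75.0, "item_2": 40.0, "item_3": 55.0, "item_4": 30.0}
scrambled = {"item_1": 45.0, "item_2": 35.0, "item_3": 80.0, "item_4": 10.0}

for item, diff in marked_differences(continuous, scrambled):
    direction = "expected" if diff > 0 else "unanticipated"
    print(f"{item}: {diff:+.0f} points ({direction})")
```

Tallying the flagged items by part of speech would then give counts of the kind reported above (for example, 6 of 8 verbs).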

The results are consistent with the expectation that might well have
been stated in advance: that "content-bearing" words such as nouns and verbs
are more likely to be influenced by paragraph cues than words which are more
concerned with the syntactical skeleton, so to speak, of the message. This
result also suggests that when cloze technique involves connected discourse
of paragraph length (as opposed to short sentence segments like the 10-word
segments used here), the scores are more likely to be related to the
examinee's ability to comprehend the total meaning of the paragraph, and thus
perhaps to his general intelligence or verbal ability. We may hypothesize,
therefore, that in order to achieve some suppression of the intellectual or
verbal variance in cloze scores, paragraph-length materials should be
avoided.

If this result had been discovered earlier in the project, more use
would have been made, in subsequent experiments, of word-deletion tests
utilizing short, 10-word segments.

The relative difficulty of items did not change radically as a result
of scrambling. This was ascertained by determining the correlation
coefficients between the array of difficulty values for the items in the connected
passages and the array of difficulty values for the items in the scrambled
passages. For pages 1E and 1pi, r = .66; for pages 5E and 5pi, r = [illegible];
and for pages 12E and 12pi, r = .68.
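As a hedged illustration of this check, the sketch below computes a product-moment correlation between two arrays of item difficulty values. The text does not name the coefficient used, so the Pearson formula is an assumption, and the difficulty values are invented.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical proportions correct for the same five items in the connected
# and in the scrambled version of a passage (not data from the study).
connected = [0.90, 0.70, 0.50, 0.30, 0.60]
scrambled = [0.80, 0.40, 0.45, 0.20, 0.30]

print(round(pearson_r(connected, scrambled), 2))
```

A coefficient well above zero, as in the values reported above, indicates that items which were hard in the connected passage tended to remain hard after scrambling.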



Comparative reliability of continuous and scrambled passages

Ebel's technique for the reliability of multiple measurements (1^)
was applied to the scores of the 20 subjects on the 4 continuous passages
and the 3 "scrambled" passages. The reliabilities were [value illegible] for the
total score on the 4 continuous passages and .72 for the total score on the three
scrambled passages (or .77 when adjusted to be comparable with the figure
given for 4 continuous passages). If continuous passages are truly less
reliable than scrambled passages, as this result suggests, the explanation
may be that the scores on continuous passages are influenced by the accidents
of total meaning context.
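The text does not say how the reliability of .72 for three scrambled passages was adjusted to a four-passage basis. Assuming the Spearman-Brown formula was used (an assumption, although it does reproduce the reported .77), the adjustment can be sketched as follows.

```python
def spearman_brown(reliability, length_ratio):
    """Projected reliability when a test is lengthened by `length_ratio`.

    Spearman-Brown formula: r' = k * r / (1 + (k - 1) * r).
    """
    k = length_ratio
    return k * reliability / (1 + (k - 1) * reliability)


# Three scrambled passages projected to the length of four passages.
adjusted = spearman_brown(0.72, 4 / 3)
print(round(adjusted, 2))  # 0.77
```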






The nature of the ability required to cloze items

The results obtained with the Thurstone Four Letter Word test may be
quickly summarized. There were no correlations between this test and any of
the scores on cloze passages which were significantly different from zero.
It appears, therefore, that the ability to do cloze items is not related to
the speed of closure factor which is reportedly measured by the Four Letter
Word test.

Fortunately, relevant data are available through the courtesy of
F. D. Weinfeld from a completely unrelated study (32) being conducted
concurrently in the Laboratory for Research in Instruction. In this study,
cloze passage li|. was included as a word-deletion test in a battery of 28
tests administered to groups of children in grades 9, 10, and 11. It was
scored in the same manner as it was in the present study. Table 4.1 shows
the correlations of this test with the 27 other tests in the battery, for
190 boys, 154 girls, and for the total sample of 344 children. The tests
are grouped on the basis of the factors disclosed in Weinfeld's study. Space
does not permit a complete description of the tests; many of them come from
the Educational Testing Service kit of reference tests. Those identified as
being by Taylor are from Calvin Taylor's recent study of variables in
communications situations ( 2.5 ). Instead of determining the average correlations
of the cloze test with the tests loaded on each factor, indices of association
were computed by treating the vector of correlations as if it were a column
in the factor matrix in Weinfeld's study, normalizing all vectors, and then
computing the inner products with the cloze test vector. The resulting
indices are included in Table 4.1.
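A minimal sketch of the index of association as described: both vectors are normalized to unit length and their inner product is taken. The function name and the numbers are hypothetical; they are not entries from Table 4.1.

```python
import math


def index_of_association(loadings, correlations):
    """Inner product of two vectors after each is scaled to unit length.

    `loadings`: loadings of the battery tests on one factor.
    `correlations`: correlations of the same tests with the cloze score.
    """
    dot = sum(a * r for a, r in zip(loadings, correlations))
    norm_loadings = math.sqrt(sum(a * a for a in loadings))
    norm_correlations = math.sqrt(sum(r * r for r in correlations))
    return dot / (norm_loadings * norm_correlations)


# Hypothetical loadings of five tests on one factor and their correlations
# with the cloze test (illustrative values only).
loadings = [0.60, 0.55, 0.10, 0.05, 0.40]
correlations = [0.50, 0.45, 0.20, 0.15, 0.35]

print(round(index_of_association(loadings, correlations), 3))
```

Because both vectors are normalized, the index depends on the pattern of the correlations across tests rather than on their overall size, which is one reason to prefer it to a simple average of correlations.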

It is immediately apparent that the cloze test scores were rather
highly correlated with a variety of well-known cognitive factors. The cloze
test correlates most highly with reasoning, verbal, theme writing, and

TABLE 4.1

Factor Loadings of Tests and Correlations of Cloze Test with Other Tests
Used in Weinfeld's Battery (Grouped According to Factors),
together with Indices of Association* (A) with Each Factor

Test Name and Number           Author       Factor     Correlation with Cloze Test
                                            Loading    Boys      Girls     Total
                                            on its     N=190     N=154     N=344
                                            Factor

Reasoning Factor: A = .760
  Sentence Order (10)          Adkins
  Pedigrees (7)                Thurstone
  Letter Series (4)            Thurstone
  Reasoning (12)               Thurstone

Verbal Factor: A = .697
  Verbal Analogies (26)        Thurstone
  Disarranged Sentences (27)   Thurstone
  Completion (24)              Thurstone
  Words in Sentences (25)      Carroll

Theme Writing (Themes rated on 4 scales):
  Sentence Structure (22)
  Originality (20)
  Choice of Words (21)
  Organization (23)

Fluency of Expression: A = .540
  Inventive Opposites (5)      Thurstone
  Word-Group Naming (14)       Adkins
  Word Association (3)         Guilford
  Telegram Writing II (13)     Taylor





o523 


.515 


.537 


.513 


cU96 


.1*07 


.1*69 


.518 




.,310 


,1*12 


.1*50 


e376 


.260 


.309 


.527 


.?53 


.1*93 


.sia 


.575 




.1*12 


.1*70 


.51*6 




.381* 


.1*68 


,i*52 


.k22 


.315 


.1*03 


scales): 


A = 


.591 




.752 


•509 


.278 


.1*39 


.6?? 


•U67 


.281* 


.1*18 


.800 




.271* 


.1*05 


.783 


.161 


.220 


.391* 


.339 




.523 


.550 


.355 


.505 


.1*01 


.1*70 


.21*9 


.330 


.21*1 


.318 


-.331 


-.293 


-.175 


-.279 



* The indices of association, A_p, were obtained by computing, after
normalization, the inner product of each vector in the factor matrix and the
vector of correlations of the word-cloze scores with the tests in the factor
matrix, that is, for the pth factor,

$$A_p = \frac{\sum_j a_{jp}\, r_j}{\sqrt{\sum_j a_{jp}^2}\,\sqrt{\sum_j r_j^2}}$$

where $a_{jp}$ is the factor loading of test j on factor p, and
$r_j$ is the correlation of test j with the word-cloze score.

TABLE 4.1
(continued)

Word Fluency: A = [illegible]

Four-Letter Words (Production) (18)

Suffixes (16)

First and Last Letters (9)

Ideational Fluency: A = .400

Distorted English (8)

Letter-Star Test (2)

Sentence Fluency (1)

Multiple Completion Sentences (6)

Topics (11)

Plot Titles (19)

Similes III (17)

Thing Categories (15)





Loading 
on its 
Factor 


Boys 


Girls 


Total 


Thurstone

Thurstone

Thurstone


.166 

.370 

.573 


.385 

.380 

.271 


.232 

.21*3 

.228 


.31*8 

.31*5 

.272 


= .UOO 










Carroll 

Carroll 

Taylor 


.296 

.393 

.U72 


.385 

.333 

.281 


.!<12 

.259 

.137^ 


.1(28 

.331 

.231 


Johnson 

Taylor 

Guilford 

Taylor 

Taylor 


.10»3 

.625 

.651i 

.722 

.U8X 


.250 

.217 

.210 

.059 

.057 


.U*2 

.u*lt 

,C06 

.052 

.051* 


.225 

,191* 

.152 

.062 

.Ola

expressive fluency factors as identified in Weinfeld's study. The cloze
test is less associated with word fluency and ideational fluency factors,
even though one might have expected a considerable association because of
the apparent similarity of the tasks.

of cognitive ability when the testing is in the subject's native language
raises grave question as to the potential efficacy of the cloze procedure as
a measure of the subject's achievement in a foreign language.

CHAPTER 5

Try-out of Tests in Secondary School Foreign Language Classes

The last major step of the present study was a tryout of the written
cloze tests in a number of secondary schools, both public and private. We
sought to find out something about the characteristics of such cloze tests
as measures of foreign language proficiency, as well as to see how practicable
these tests would be in a secondary school setting.

Cooperation was secured from 5 secondary schools in the region surrounding
Boston: of these, 2 were public high schools and 3 were private schools.
In every case, the groups contained at least a few students who had just
taken, or were about to take, the College Board language examination in the
language of their choice. Students of French were usually in their 3rd, 4th,
or 5th year of instruction, but the students of German were all in either
the 2nd or 3rd year of instruction. The testing was done in late April and
early May, 1958; this was at a time when many of the students had just taken
the College Board examinations.



Cloze tests selected for the high school study

The limitations of the testing time likely to be available made it
necessary to use only a small selection of the cloze materials which had been
developed and tried out with adults. Two passages were selected from the
word-cloze materials and two tests from the letter-cloze materials. For
French, these were passages 3 and 10 for the word-cloze and tests 11b and 7b
for the letter-cloze tests. For German, passages 3 and 14 of the word-cloze
materials, and pages lib and 8b of the letter-cloze materials were used.
These selections represented an attempt to offer high-school students highly
reliable materials of not too high a level of difficulty.

Test administration procedures

In four of the five schools, the cloze tests were administered by classroom
teachers according to written instructions which had been furnished to
them; in the fifth school, the cloze tests were administered by project
personnel.

The tests themselves consisted of 6-page mimeographed booklets. Page 1
contained a short questionnaire soliciting data on (1) how long the student
had studied the foreign language; (2) whether he had studied any other
foreign language; and (3) whether a foreign language was spoken in his home.
(Except for the data received as to how long the language tested had been
studied, little or no use has been found in this study for the information
on this questionnaire. There were only sporadic cases in which foreign
languages were spoken in the home.) (See Appendices C and D.)

Also on page 1 were the instructions for the word-cloze tests which
were to be found on pages 2 and 3. The instructions were as follows:

Part I of this test is made up of two [French] or [German] passages, each
from a different book. Every tenth word has been replaced by a blank. Your
task is to read through the passage to see what it is about, and then try to
fill in each blank with a [French] or [German] word that makes sense. The
blanks are all the same length, but remember that the words which have been
taken out may be short or long. As some blanks will be easier to fill than
others, it is a good idea to do the easier ones first, and then go back to
work on the harder ones, if you have time.

Do one page at a time. When you finish one page, wait until your
instructor tells you to go to the next. Do not look at the first page until
your instructor tells you to begin.

Work rapidly, because you have only nine minutes for each passage!

The nine-minute time limit for each word-cloze passage (20 completions)
was chosen in the light of experience with adults. Even for the high-school
students, it was long enough to permit most students to get to the end of the
passage. It may be noted that the instructions asked the students to fill
each blank with a word "that makes sense"; there was no suggestion that the
student try to guess what word actually stood in the original text.
It is probable, however, that the precise wording of the instructions made
little difference. The instructions were developed after fairly extensive
tryouts with volunteer subjects obtained around the laboratory.

The two word-cloze tests followed as pages 2 and 3 of the booklets; they
were separately timed.

Page 4 contained the instructions for Part II, the letter-cloze
materials. The instructions for German, for example, were as follows:

On the pages that follow are words and parts of words. The first page
has examples 11 letters long; the second one has examples 7 letters long.
An example does not necessarily begin at the beginning of a word or end at
the end of a word, and the examples are completely unrelated to each other.
Here the middle letter has been replaced by a blank; try to put a
letter in the blank that makes sense. Besides the ordinary letters of the
German alphabet, there is one other symbol you may use: *, which stands for
the space between words.



Sample Problems 


Answers 


JEDE*J)CHE* 


iIEDEkVOCHE* 


ACHSTJ*KAM 


ACHSia«KUl 


INBSAbJSPOTT 




#ES_KL0 


*E^KLO 






HEU_E*U 





Do not look at the next page until your instructor tells you to begin.
When you finish one page, wait until you are told you may go on to the next.

Each page of the letter-cloze materials had listed across the top all
the symbols (including variants with the several diacritical marks) which
could be used. These arrays of symbols were intended to stimulate the
student to indicate the exact letter and diacritical mark which he proposed
to substitute for each blank. No special comment was made about diacritical
marks, however.
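The construction of a letter-cloze item, as described in the instructions above, can be sketched as follows. The function is hypothetical; the example string is chosen so that the output matches what the first sample problem appears to read (JEDE*_OCHE*, answer W). The asterisk for the space between words follows the instructions.

```python
def make_letter_cloze_item(text, start, length):
    """Cut `length` characters from `text` beginning at `start`, write
    spaces as '*', and blank out the middle character.

    Returns (problem, answer). `length` must be odd so that a single
    middle character exists (11 and 7 in the study).
    """
    if length % 2 == 0:
        raise ValueError("length must be odd")
    segment = text[start:start + length].upper().replace(" ", "*")
    if len(segment) != length:
        raise ValueError("text is too short for the requested segment")
    middle = length // 2
    answer = segment[middle]
    problem = segment[:middle] + "_" + segment[middle + 1:]
    return problem, answer


problem, answer = make_letter_cloze_item("jede woche ", 0, 11)
print(problem, answer)  # JEDE*_OCHE* W
```

Note that the segment need not begin or end at a word boundary, so any starting offset in running text may be used.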

The total cloze-material tests required nearly all of the typical class
period. The schedule prescribed was as follows:

 2 minutes: filling out information required on page one.
 2 minutes: reading instructions to Part I and answering any questions.
 9 minutes: first word-cloze passage.
 9 minutes: second word-cloze passage.
 2 minutes: reading instructions for Part II and answering any questions.
 6 minutes: first letter-cloze page.
 6 minutes: second letter-cloze page.

36 minutes (total)



Collateral data secured

(1) The Psi-Lambda Foreign Language Aptitude Test (Carroll and Sapon)

In some schools, it was possible to secure additional testing time in
order to give the foreign language aptitude test developed by Carroll and
Sapon (6). Use was made of the preliminary form identified as Form P,
Psi-Lambda Foreign Language Aptitude Battery. Only parts [illegible] and 5 of the
test were given in order to conserve testing time. The authors supplied
data permitting the establishment of T-scores* derived from a weighted
combination of the scores on these parts. This test was given in order to
afford a basis for matching the groups tested in terms of a variable related
to success in language learning.

In all cases, this test was administered in person by representatives
of the project.

(2) Results of CEEB language examinations.

Eighty-two of the 257 secondary school students tested in this experiment
took the CEEB language examinations in French or German in the spring
of 1958. Scores on these examinations were furnished either by the schools
themselves or through the courtesy of the CEEB, New York. These scores
utilize the College Board scale with mean of 500 and s.d. of 100.

* The T-scores had been scaled by the test authors in such a way that they
had a mean of 50 and a standard deviation of 10 in a standardization sample
of 912 cases, including about 11% high school students, 26% college students,
and 63% adults (enlisted men in the U.S. Air Force, etc.).

(3) Teachers' grades.

These were the course grades given by the teacher for the term ending
in June 1958; the scale in which they are given varies from school to school,
some schools using letter grades, others using percentage figures.

(4) Academic average.

In some schools, data were obtained on the general academic standing of
the pupils involved in the experiment.

(5) Intelligence measures.

A variety of measures of intelligence were available, all from school
records. These included Scholastic Aptitude Test scores and IQ's derived
from the Otis intelligence tests.

Scores on the cloze tests

For the data summarized in this chapter, it may be assumed (except where
otherwise noted) that the scores on the cloze tests were obtained by counting
the number of completions which exactly corresponded to the words or letters
which stood in the original texts. Both part and total scores were obtained
for the materials.
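The word-cloze procedure and the exact-match scoring rule can be sketched in Python. The function names are hypothetical, and the treatment of capitalization and surrounding spaces (ignored here) is an assumption; the text specifies only that a completion must correspond exactly to the original word.

```python
def make_cloze(words, every=10, blank="_____"):
    """Replace every `every`-th word with a fixed-length blank.

    Returns (cloze_words, answers), where `answers` holds the deleted
    words in order of occurrence.
    """
    cloze_words, answers = [], []
    for position, word in enumerate(words, start=1):
        if position % every == 0:
            answers.append(word)
            cloze_words.append(blank)
        else:
            cloze_words.append(word)
    return cloze_words, answers


def exact_score(responses, answers):
    """Count responses identical to the deleted word.

    Assumption: case and surrounding whitespace are ignored.
    """
    return sum(
        response.strip().lower() == answer.lower()
        for response, answer in zip(responses, answers)
    )


# A 30-word dummy passage yields 3 deletions; a 200-word passage yields 20.
words = [f"w{i}" for i in range(1, 31)]
cloze_words, answers = make_cloze(words)
print(answers)                                     # ['w10', 'w20', 'w30']
print(exact_score(["w10", "x", "W30"], answers))   # 2
```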

The level of foreign language proficiency of high-school students

In dealing with any set of measurement data we are inclined to look
first at measures of central tendency. In the present case we have a large
number of small groups at various stages of their formal instruction in a
foreign language. We may first ask: What level of proficiency do the tests
show at various stages of instruction? Secondly: Do the tests differentiate
among groups at various stages of instruction?

Table 5.1 shows the means and standard deviations of the several part
and total scores of the French cloze materials for a number of the separate
groups. Comparative data from the adult testing described in Chapter 3 are
also presented. Table 5.2 is a similar table of data for the German cloze
tests.

In order to assist in the interpretation of the levels of proficiency
indicated by the means, Tables 5.3 and 5.4 have been prepared. On the
assumption that cloze scores measure on a ratio scale having an absolute
zero, and on the further assumption that the performance of adult bilinguals
represents an anchor point with which the performance of learners may be
compared, the means for the various high school groups have been expressed
as proportional parts of the means for adult bilinguals on the same tests.
Because of the small number of cases available, the data for adult bilinguals
are not as reliable as might be desired, although they are at about the same
level in the two languages. In the case of the French data, however, there
is an interesting proportionality between the results for the word-cloze
and the letter-cloze tests. That is to say, the relative level of the mean
word-cloze score of any group is very similar to the level of the
corresponding mean for the letter-cloze test. For example, for all students in
third-year French, the mean word-cloze score is 52.9% of adult performance,
while the mean letter-cloze score is 56.0% of adult performance.
Corresponding figures for students in fourth-year French are 58.7% and 58.0%,
respectively. It is at least intuitively reasonable to think that the achievement
of fourth-year students of French is about 60% of the asymptote of learning.
On the other hand, the measured achievement of third-year students, at 52.9%
and 56.0% of adult performance on the word-cloze and letter-cloze tests,
respectively, seems higher than what one would expect. More information
would be needed to interpret these results, e.g. information on the aptitude
of the students and the nature of the instructional content.
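The "proportional part" is simply a group mean divided by the adult bilingual mean on the same test. As a small worked example, the sketch below uses the German word-cloze means given in Table 5.4 for Schools C and E (11.3 and 5.7) and the adult bilingual mean (29.5); the function name is hypothetical.

```python
def proportional_part(group_mean, bilingual_mean):
    """Express a group mean as a fraction of the adult bilingual mean."""
    return group_mean / bilingual_mean


ADULT_WORD_CLOZE_MEAN = 29.5  # adult bilinguals, German word-cloze test

print(round(proportional_part(11.3, ADULT_WORD_CLOZE_MEAN), 3))  # 0.383
print(round(proportional_part(5.7, ADULT_WORD_CLOZE_MEAN), 3))   # 0.193
```

The interpretation rests on the two assumptions stated above: that cloze scores form a ratio scale with an absolute zero, and that adult bilingual performance is a suitable anchor point.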
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Means of French Cloze Tests for High-School Students 




This is an “accelerated” class.

TABLE 5.4

Means of German Cloze Tests for High-School Students
Expressed as Proportional Parts of Means for Adult Bilinguals,
By Year of Language Instruction

                                 German                  German
                                 Word-Cloze Test         Letter-Cloze Test
Year   School    N               Mean       P.P.         Mean       P.P.

2      C         9               11.3       .383         60.8       .791
       E         24               5.7       .193         41.4       .539
       Total     33               7.2       [illeg.]     [illeg.]   .607

3      C         9               [illeg.]   [illeg.]     64.4       .637
       E         [illeg.]         9.4       [illeg.]     44.4       .578
       Total     30              10.2       .349         47.8       .623

Adult Bilinguals Approx. 12      29.5       1.000        76.7       1.000

The results for German (Table 5.4) do not show the clear proportionality
which was evident for the French results. The means for the word-cloze tests
are much lower than one would expect on the basis of the French results, and
the means for the letter-cloze results are a little higher (relative to
adult bilingual performance) than one would expect. These results may
possibly indicate that different aspects of German are learned at different
rates: those aspects relevant to a word-cloze test are learned relatively
slowly, but those aspects relevant to a letter-cloze test are learned
relatively fast. What these aspects may be is a matter for speculation and
further research.

Both for the French and the German results displayed in Tables 5.1 and
5.2, it is disappointing that there is relatively little differentiation
among groups in different years of language instruction. Even when we
consider the year-groups in any individual school, there is not any dramatic
progress shown from year to year. It has not even seemed worthwhile to make
any precise tests of the statistical significance of the trends; many of the
trends would undoubtedly be only of marginal significance, and even the
trends which one would judge to be significant beyond, say, the [illegible]
level are still not strong enough to justify using the cloze tests as an
indicator of amount of instruction.

Intercorrelational results

For the French cloze tests, the data were sufficiently voluminous in
Schools C (a private boys' school) and D (a public high school enrolling
both sexes) to justify the computation of intercorrelation matrices, shown
in Tables 5.5 and 5.6. Table 5.7 shows intercorrelation matrices for School
E (another public high school) where the German cloze tests were given.

These data are the basis for the subsequent remarks concerning the reliability
and validity of cloze tests.

TABLE 5.5

Intercorrelation Matrix for Cloze Tests, Language Aptitude Test,
Grades in French, and CEEB French Score

School C (Private Boys' School)

N = 26 Year 3
  = 37 Year 4
  = 23 Year 5
N = 86 Total

                                 W-C T L-C T                            LA   GIF
Variable             No.  Yr.      1     2     3     4     5     6     7     8     Mean     S.D.

Word-Cloze            1    3     1.00   .30   .77   .80   .30   .24   .56   .63    14.6      3.6
Total                      4     1.00   .19   .80   .79   .17   .17  -.10   .42   [illeg.]   3.0
                           5     1.00   .52   .89   .89   .21   .61   .51   .72    16.1      5.1
                           T     1.00   .35   .82   .84   .25   .36   .43   .57    15.0      3.9

Letter-Cloze          2    3      .30  1.00   .10   .37   .86   .90   .45   .43    40.2     11.5
Total                      4      .19  1.00   .12   .19   .84   .91   .02   .16    44.1     11.9
                           5      .52  1.00   .39   .55   .78   .83   .65   .37    49.6     10.6
                           T      .35  1.00   .20   .38   .84   .90   .37   .37    44.4     12.0

Word-Cloze            3    3      .77   .10  1.00   .23   .14   .05   .48   .49     8.9      2.5
Test 3                     4      .80   .12  1.00   .27   .18   .06  -.11   .06     9.2      1.9
                           5      .89   .39  1.00   .59   .07   .53   .39   .64     9.3      2.9
                           T      .82   .20  1.00   .36   .13   .21   .26   .35     9.1      2.3

Word-Cloze            4    3      .80   .37   .23  1.00   .33   .32   .40   .49     5.7      2.1
Test 10                    4      .79   .19   .27  1.00   .09   .22   .27   .62     5.4      1.8
                           5      .89   .55   .59  1.00   .31   .56   .53   .63     6.9      2.8
                           T      .84   .38   .38  1.00   .27   .38   .45   .60     5.9      2.4

Letter-Cloze          5    3      .30   .86   .14   .33  1.00   .56   .58   .34    19.4      6.0
Test 7b                    4      .17   .84   .18   .09  1.00   .54   .06   .06    21.0      5.9
                           5      .21   .78   .07   .31  1.00   .30   .41   .24    23.5      6.1
                           T      .25   .84   .13   .28  1.00   .51   .37   .28    21.2      6.2

Letter-Cloze          6    3      .24   .90   .05   .32   .56  1.00   .25   .42    20.8      7.0
Test 11b                   4      .17   .91   .06   .22   .54  1.00  -.02   .20    23.1      7.7
                           5      .61   .83   .53   .56   .30  1.00   .63   .35    26.0      7.0
                           T      .36   .90   .21   .38   .51  1.00   .28   .36    23.2      7.6

Lang. Aptitude        7    3      .56   .45   .48   .40   .58   .25  1.00   .67    54.5      7.2
Test                       4      .10   .02  -.11   .27   .06  -.02  1.00   .25    52.8      5.2
                           5      .51   .65   .39   .53   .41   .63  1.00   .57    58.4      6.2
                           T      .43   .37   .26   .45   .37   .28  1.00   .58    54.8      6.6

Grade in              8    3      .63   .43   .49   .49   .34   .42   .67  1.00    74.2      7.5
French                     4      .42   .16   .06   .62   .06   .20   .25  1.00    73.2      6.2
                           5      .72   .37   .64   .63   .24   .35   .57  1.00    82.2      6.1
                           T      .57   .37   .35   .60   .28   .36   .58  1.00    75.9      7.6

CEEB (N = 24)                     .74   .34   .53   .79   .08   .46   .52   .80   [illeg.]  89.0

TABLE 5.6

Intercorrelation Matrix of Cloze Tests, IQ, Grades in French
and CEEB Language Scores

School D (Public High School)

N = 36 Year 3 
= il Year ii 















W-G 


1 

1 InO 


1 

{ 4 






.b 


Variable 








W-C T 


L-C T 






% 




• IQ 


GIB* 


Mean 


S.D. 










1 


2 


3 


i» 


5 


6 


7 
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Ir. 


( 




















Word-Cloze 




1 


#% 


io 6 o 




.84 




.30 


.51; 


♦ 1 ^ 


.w 


"HuT 


3.6 


Total 






li 


1.00 


.39 


.31; 


.80 


.27 


.39 


.36 


.33 


15.9 


2.9 








T 


1.00 


.la 


.85 


.82 


.26 


•li; 


.26 


JlO) 


15.3 


3.3 


Letter-Cloze 


2 


3 


.it8 


1.00 


.30 


•51 


.87 


CVJ 

ON 

• 


.39 


.18 


1(6.2 


13.0 


Total 






k 


•39 


1.00 


.37 


.28 , 


.61; 


.90 


.1;9 


.16 


lik.O 


11.7 








T 


.la 


1.00 


.30 


.1;0 


.85 


.91 


.10; 


.17 


1(5.1 12.1( 
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3 


3 


.8U 


.30 


1.00 


•1;0 


.11 


•1;0 


.20 


.1;9 


9.0 


2.1 


Test 3 






k 


.81; 


.37 


1,00 


.36 


.21; 


.38 


.38 


.25 


10.1 










T 


.85 


.30 


loOO 


.39 
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.36 


.28 


.38 


9.6 


2 a 


/ord-Cloze 




h 


3 


.81; 


.51 


.1;0 


1.00 


.39 


.50 


.11 


.29 


5.5 
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Test 10 






h 
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.28 


.36 


1.00 


.21 


.27 


.18 


.31 


5.8 


1.7 
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.82 


.1;0 


.39 


loOO 


.30 


.39 


.11; 


.30 


5.7 


1.9 


Letter-Cloze 


5 


3 


.30 


.87 


.11 


.39 


IvOO 


.60 


.J1 


.06 


23.7 


6*5 


Test 7b 
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.27 


.81; 


.21; 


.21 


1.00 


.51 


.hS 


ax 


23o7 


5»9 
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.26 


.85 


.11; 


.30 


1.00 


.56 


.38 


.08 


23.2 


6.2 


Letter-Cloze 


6 


3 


.51; 


.92 


.1;0 


.50 


.60 


1.00 


CO 


.25 


22.5 


8.0 


Test Ub 
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.39 


.90 


.38 


.27 


.51 


1.00 


.la 


.16 


21.3 


7.5 








T 


.kk 


.91 


.36 


.39 


.56 


1.00 


.39 


.20 


21.9 


7^8 


IQ 




7 


3 


.19 


.39 


.20 


.11 


.31 


.38 


1.00 


.22 


117.0 


7.3 
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.36 


‘49 


.38 


.18 


.1;5 


.la 


loOO 


.26 


116.8 


8.8 








T 


.26 


.10; 


.28 


.11; 


.33 


.39 


1.00 


.21; 


116.9 


8.1 


Grade In 




8 


3 


.1;7 


.18 


.1;9 


.29 


.06 


.25 


.22 


1.00 


52.9 


21.1 
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h 


.33 


.16 


.25 


.31 


.11 


.16 


.26 


1.00 


5i;.9 


20.9 
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.1;0 


.17 


.38 


.30 


.08 


.20 


.21; 
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3 


.59 


.26 


.111 


.61 


.17 


.28 


.21; 


.63 
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.38 


.03 
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.18 


.36 


.37 


531.5 58.: 7 
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.21 
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TABLE ?•? 



Intercorrelation Matrix for Cloze Tests, Language Aptitude Test,
IQ, and Grades in German

School E (Public High School)

N = 24, Year 2
  = 25, Year 3

Variable                 Year     1     2     3     4     5     6     7     8     9    Mean   S.D.

1 Word-Cloze Total          2  1.00   .37   .88   .85   .27   .42   .55   .50   .79     5.7    3.9
                            3  1.00   .70   .93   .93   .55   .67   .71   .61   .65     9.4    5.4

2 Letter-Cloze Total        2   .37  1.00   .29   .34   .93   .96   .08   .61   .11    41.4   13.9
                            3   .70  1.00   .73   .58   .88   .92   .69   .48   .43    44.1   11.5

3 Word-Cloze Test 3         2   .88   .29  1.00   .49   .22   .34   .59   .59   .77     2.7    2.4
                            3   .93   .73  1.00   .73   .58   .66   .67   .57   .53     4.2    2.9

4 Word-Cloze Test 14        2   .85   .34   .49  1.00   .24   .39   .34   .25   .59     3.0    2.2
                            3   .93   .58   .73  1.00   .45   .59   .65   .56   .68     5.2    2.9

5 Letter-Cloze Test 7b      2   .27   .93   .22   .24  1.00   .78   .10   .52   .05    17.4    6.4
                            3   .55   .88   .58   .45  1.00   .66   .56   .44   .23    20.4    6.1

6 Letter-Cloze Test 4b      2   .42   .96   .34   .39   .78  1.00   .07   .63   .15    24.1    8.1
                            3   .67   .92   .66   .59   .66  1.00   .67   .38   .48    23.7    7.2

7 Language Aptitude Test    2   .55   .08   .59   .34   .10   .07  1.00   .50   .51    53.2    5.7
                            3   .71   .69   .67   .65   .56   .67  1.00   .61   .71    54.4    7.1

8 IQ                        2   .50   .61   .59   .25   .52   .63   .50  1.00   .43   114.2    7.8
                            3   .61   .48   .57   .56   .44   .38   .61  1.00   .46   115.7   10.5

9 Grades in German          2   .79   .11   .77   .59   .05   .15   .51   .43  1.00    46.2   32.2
                            3   .65   .43   .53   .68   .23   .48   .71   .46  1.00    48.4   26.5


The reliability of cloze tests in high-school groups

It will be recalled that in Chapter 3 it was argued that an odd-even
reliability computation would not be appropriate for cloze materials (at
least not for word-cloze materials presented in continuous paragraphs)
because the two [text missing in the source] computed on the basis of the
interaction term (persons x tests adjusted for variance due to passages) in
a situation where some eight to ten tests were available for each individual.
For adult bilinguals, the standard error of measurement for a single cloze
passage of 20 deletions was estimated as 1.46 (French) or 1.52 (German) and
for a test of 50 letter-deletions as 1.37 (French) or 2.06 (German). Table 5.8
shows the standard errors of measurement for scores based on 40 deletions
(word-cloze) and 100 deletions (letter-cloze).

For the high school samples, we have only the correlations between two
tests; these tests are not strictly parallel because they are of differing
difficulties. The intercorrelations between two non-parallel tests probably
represent underestimates of reliability and, correspondingly, overestimates
of the standard errors of measurement. Since no further data are available,
we are forced to present these as the best available estimates of reliability.
Table 5.8 shows these data for both French and German cloze tests, with
estimated comparable data for adult bilinguals.

The reliabilities estimated for the French word-cloze test of 40 items
range from .37 to .74 in various samples, to a considerable extent depending
upon the variance of scores in these samples. Values of .55 and .56 are
obtained for the School C and School D samples, respectively, with
corresponding standard errors of measurement of 2.6 and 2.2. On the basis of
these estimates, one may further state that a test of about 180 items would be
required to produce reliabilities of [value illegible] in these samples, and
students would have to be allowed at least 6[?] minutes to complete such a
test, based on our experience with a [?]-minute time-limit with a 20-item test.


TABLE 5.8

Estimated Reliabilities and Standard Errors of Measurement
of Total Word-Cloze and Letter-Cloze Scores, together with
Data on the Intercorrelation of these Two Tests

                          Word-Cloze Total (W)            Letter-Cloze Total (L)           Correlation of
                               40 Items                        100 Items                      W and L
                                                                                                    corr. for
                                 est.          σ                  est.          σ                   attenu-
Sample       Year    N      r    rel.   S.D.  meas.          r    rel.   S.D.  meas.        r_WL    ation

FRENCH

School C        3   26    .23    .37    3.6    2.9         .56    .72   11.5    6.1          .30     .58
                4   37    .27    .42    3.0    2.3         .54    .70   11.9    6.5          .19     .35
                5   23    .59    .74    5.1    2.6         .30    .46   10.6    7.8          .52     .89
            Total   86    .38    .55    3.9    2.6         .51    .68   12.0    6.8          .35     .57

School D        3   38    .40    .57    3.6    2.4         .60    .75   13.0    6.5          .48     .73
                4   41    .36    .53    2.9    2.0         .51    .68   11.7    6.6          .39     .65
            Total   79    .39    .56    3.3    2.2         .56    .72   12.4    6.6          .41     .65

Adult Bilinguals          .37    .54    3.0    2.0         .88    .93    7.5    1.9
(estimated)*              (N = 16)                         (N = 15)

GERMAN

School E        2   24    .49    .66    3.9    2.3         .78    .88   13.9    4.8          .37     .49
                3   25    .73    .84    5.4    2.2         .66    .80   11.5    5.1          .70     .85

Adult Bilinguals          .70    .83    5.2    2.1         .90    .94   13.0    2.9
(estimated)*              (N = 22)                         (N = 12)

* Data from Chapter 3
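The arithmetic behind the estimated reliabilities, the standard errors of measurement, and the projected test lengths can be sketched as follows. This is an illustrative check rather than the procedure actually used in the study: the function names are hypothetical, the School D figures are as read from Table 5.8, and the target reliability of .85 is an assumption.

```python
import math


def spearman_brown(r, k=2.0):
    """Reliability of a test lengthened k times, given reliability r at unit length."""
    return k * r / (1.0 + (k - 1.0) * r)


def lengthening_factor(r, target):
    """Factor by which a test of reliability r must be lengthened to reach `target`."""
    return target * (1.0 - r) / (r * (1.0 - target))


def standard_error_of_measurement(sd, reliability):
    """Standard error of measurement from the score S.D. and the reliability."""
    return sd * math.sqrt(1.0 - reliability)


# School D, all cases: the two 20-item word-cloze passages correlate .39
# and the 40-item total has an S.D. of 3.3 (figures as read from Table 5.8).
rel_40 = spearman_brown(0.39)                        # reliability of the 40-item total
sem_40 = standard_error_of_measurement(3.3, rel_40)  # standard error of measurement

# Assumed target reliability of .85 (not a figure confirmed by the text).
k = lengthening_factor(0.56, 0.85)
items_needed = 40 * k

print(round(rel_40, 2), round(sem_40, 1), round(k, 1), round(items_needed))
```

Under these assumptions the 40-item test would need to be lengthened roughly four and a half times, which is in line with the figure of about 180 items given in the text.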

The reliabilities for the German word-cloze test (also of 40 items) are
somewhat higher than the French, being .66 and .84 for the two samples from
School E, but the standard errors of measurement are about the same as for
the French word-cloze tests (2.3 and 2.2 in the two samples, respectively),
indicating that the relative magnitudes of the reliability coefficients are
largely a function of different ranges of talent.

The French letter-cloze tests for which reliabilities are estimated
consist of 100 items; their reliabilities range from .46 to .75, typical
values being .68 for School C data and .72 for School D data, with
corresponding standard errors of measurement of 6.8 and 6.6, respectively.
Nevertheless, it would still be necessary to lengthen the test by about two
and a half times to achieve reliability in the neighborhood of .85; that is,
the test would have to be about 240 items in length, requiring about 30
minutes as a time-limit if we extrapolate from the time-limit of 6 minutes
used for a 50-item test.

Reliability was satisfactory for the German letter-cloze test.
Coefficients of .88 and .80 were obtained,
respectively, in two high-school classes in School E.

From Table 5.8 it can be seen that for word-cloze the standard error
of measurement found for the high-school samples is comparable to the
standard error of measurement estimated for adult bilinguals. This is true
both for French and German word-cloze materials. It may be concluded that
the word-cloze test has a relatively uniform standard error of measurement
for different parts of the score range.

For letter-cloze tests, however, the standard error of measurement is
distinctly smaller at the upper levels of the score range reached with the
adult bilinguals. This seems to be true, at least, if we can judge from the
lower σ meas. for both French and German bilinguals as compared with the
σ meas. in high-school samples.

Intercorrelations of word-cloze and letter-cloze tests

Together with data on the reliabilities of the respective tests, Table 5.8
presents data concerning the correlation between the word-cloze and
letter-cloze tests. These statistics are pertinent to the question of whether
these tests measure the same kind of language competence. Since the raw
coefficients are partly dependent on the reliabilities of the respective
variables, correlations corrected for the effect of attenuation have also
been presented. In making these corrections for attenuation, the estimates of
reliability made in Table 5.8 were used; since it is probable that these
reliability values are underestimates, it is thereby probable that the
estimates of non-attenuated correlations are overestimates. The independent
values of non-attenuated correlations presented in Table 5.8 range from .35
to .89 with a median of .65. This value is far from unity, and in view of the
probability that it is even then an overestimate of the non-attenuated
correlation, we may conclude that the word-cloze and letter-cloze tests
measure somewhat different aspects of language achievement.
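The correction for attenuation described above can be illustrated with a short sketch. The function name is hypothetical, and the input triples (raw correlation, estimated word-cloze reliability, estimated letter-cloze reliability) are the values for the seven separate class samples as read from Table 5.8, so any misreading of the table carries over.

```python
import math
from statistics import median


def correct_for_attenuation(r_xy, rel_x, rel_y):
    """Estimate the correlation between two measures if both were perfectly reliable."""
    return r_xy / math.sqrt(rel_x * rel_y)


# (r_WL, est. reliability of W, est. reliability of L), as read from Table 5.8.
samples = {
    "School C, Year 3": (0.30, 0.37, 0.72),
    "School C, Year 4": (0.19, 0.42, 0.70),
    "School C, Year 5": (0.52, 0.74, 0.46),
    "School D, Year 3": (0.48, 0.57, 0.75),
    "School D, Year 4": (0.39, 0.53, 0.68),
    "School E, Year 2": (0.37, 0.66, 0.88),
    "School E, Year 3": (0.70, 0.84, 0.80),
}

corrected = {
    name: round(correct_for_attenuation(*values), 2)
    for name, values in samples.items()
}

for name, value in corrected.items():
    print(f"{name}: {value:.2f}")
print("median:", median(corrected.values()))
```

Because the reliabilities entering the denominator are probably underestimates, each corrected value is, if anything, too large, which is the point the text makes about the median.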

Validity of the cloze tests as measures of foreign language proficiency

From the limited evidence collected in the present study it is impossible
to give a satisfactory assessment of the validity of the word-cloze and
letter-cloze tests as measures of foreign language proficiency. A completely
satisfactory judgment on this question could probably be made only after
extensive factor-analytic studies of a variety of measures of language
proficiency. Some tentative answers and suggestions arise from the present
data, however.

One type of evidence of validity, of course, is the finding that the
cloze tests differentiate between learners and those who have more or less
completely learned, i.e. native or bilingual speakers of the foreign
language in question. In a previous section it has been suggested that
cloze-test scores measure on a ratio scale, and it has been pointed out that
at least for the French cloze tests, word-cloze and letter-cloze tests yield
group mean scores which represent highly similar proportions of adult
performance. Such results strongly suggest that the cloze tests are in fact
measuring some important facet of foreign language proficiency, but this is
much more true for groups than for individuals. That is to say, if we use
group results to cancel out individual variations on all the extraneous
factors which may contribute to the determination of cloze test scores, the
group means reflect real differences in foreign language competence.

Another kind of evidence of validity is to be found in the correlations
of cloze test scores with teachers' grades. It is a common myth among
educational psychologists that teachers' grades are notoriously unreliable;
but this does not seem to be true, necessarily, of teachers' grades in
foreign language courses, which are frequently found to correlate highly
enough with other variables to suggest that they are quite reliable. This
seems reasonable enough when we consider that the student's performance,
especially in courses emphasizing the audio-lingual aspects of foreign
language teaching, is much more concrete and readily apparent to any
observer. Tables 5.5, 5.6, and 5.7 contain numerous correlations of cloze
test scores with teachers' grades. The results are rather surprisingly
variable and it is difficult to make generalizations. Nevertheless, as a
general observation we may state that for the data with cloze tests in
French there is a tendency for the word-cloze test to correlate higher with
teachers' grades than the letter-cloze test does. For all cases at School
C, word-cloze total scores correlate .57 with grades, while letter-cloze
scores correlate .37 with grades. At School D, the correlations with
teachers' grades are .40 for the word-cloze test and only .17 for the
letter-cloze test. [In view of the lower reliability of the word-cloze]
tests as compared with the letter-cloze tests, one could assume that the
word-cloze test has an even higher underlying correlation with teachers'
grades than might otherwise appear.

Similar results are obtained for the cloze tests in German. At School
E, the word-cloze test showed correlations of .79 and .65 with teachers'
grades, while the corresponding correlations for the letter-cloze test were
.11 and .43.

These results, incidentally, constitute further evidence that the
word-cloze and letter-cloze tests measure somewhat different aspects of
language performance.

From the standpoint of validating the cloze tests, and on the assumption
that teachers' grades constitute a valid criterion, it may be tentatively
suggested that the word-cloze type is more valid than the letter-cloze
type. Word-cloze tests make greater demands on the ability of the learner
to select appropriate words and grammatical forms to fit a context, whereas
the letter-cloze test demands chiefly a sensitivity to the orthographic
customs of a language system and a knowledge of the proper spelling of
foreign language words.

Further evidence pertaining to the validity of the cloze tests lies in
the correlations between these tests and the CEEB language examination
scores which were available for some cases. Tables 5.5 and 5.6 present
several rows of correlation coefficients of part and total scores on the
French cloze tests with CEEB scores. Again there is a surprising degree of
variation in the results: the word-cloze test shows correlations as high
as .74 in one sample and as low as .10 in another. These samples are very
small, however. The letter-cloze test has generally low correlations with
the CEEB scores, ranging from .26 to .38.

In an effort to obtain more reliable results, data were assembled for
all 76 students in the total sample (including both public and private
schools) for whom CEEB French scores were available. (CEEB German scores
were available for only 6 students and were not included in the analysis.)
The correlation matrix for CEEB French scores with the various cloze scores
is to be found in Table 5.9.

Of the two types of cloze tests, the word-cloze test has a higher
correlation with CEEB French score, this correlation being .616 and the
correlation for the letter-cloze test being .354. If we use the
intercorrelation between the two parts of each test as a basis for an
estimate of their reliabilities, we obtain (by the customary Spearman-Brown
formula) a reliability coefficient of .656 for the word-cloze score and .683
for the letter-cloze score. Using these reliability values and assuming
perfect reliability for the CEEB score, we can estimate the validities which
the two types of cloze test would have if they were perfectly reliable. We
find the correlations corrected for attenuation to be .761 for the
word-cloze, and .429 for the letter-cloze score. These values suggest that
neither type of cloze test measures exactly the same kind of achievement as
is measured by the CEEB language test.

This conclusion is further reinforced by the fact that the multiple
correlation for predicting CEEB score from the two cloze scores is only .619,
just barely greater than the zero-order correlation between word-cloze score
and CEEB. Thus, the letter-cloze score adds hardly anything to the prediction
of CEEB score from the word-cloze score. Whatever the letter-cloze test is
measuring, it is not related to CEEB-tested language achievement beyond
whatever is accounted for by the word-cloze test.


TABLE 5.9

Intercorrelations of Cloze Scores and CEEB French Scores

N = 76 cases

(Public and private secondary school
students for whom CEEB scores were available)

Variable                        1       2       3       4       5       6       7

1 Word-Cloze Total          1.000    .494    .868    .856    .264    .568    .616
2 Letter-Cloze Total         .494   1.000    .349    .506    .843    .898    .354
3 Word-Cloze Test 3          .868    .349   1.000    .489    .158    .424    .412
4 Word-Cloze Test 10         .856    .506    .489   1.000    .299    .558    .655
5 Letter-Cloze Test 7b       .264    .843    .158    .299   1.000    .520    .168
6 Letter-Cloze Test 4b       .568    .898    .424    .558    .520   1.000    .425
7 CEEB Score                 .616    .354    .412    .655    .168    .425   1.000

Mean                        15.99   47.75    9.75     [?]     [?]     [?]     [?]

[?] = entry illegible or missing in the source copy.
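The multiple correlation quoted in this passage can be checked from three zero-order coefficients. The sketch below uses the standard two-predictor formula; the function name is hypothetical and the three input values are as read from Table 5.9.

```python
import math


def multiple_r_two_predictors(r_y1, r_y2, r_12):
    """Multiple correlation of a criterion y with two predictors.

    r_y1, r_y2: correlations of each predictor with the criterion
    r_12:       correlation between the two predictors
    """
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2.0 * r_y1 * r_y2 * r_12) / (1.0 - r_12 ** 2)
    return math.sqrt(r_squared)


# Word-cloze total with CEEB, letter-cloze total with CEEB, and word-cloze
# total with letter-cloze total (values as read from Table 5.9).
R = multiple_r_two_predictors(0.616, 0.354, 0.494)
gain = R - 0.616  # improvement over the word-cloze score used alone

print(round(R, 3), round(gain, 3))
```

The gain over the word-cloze correlation alone is in the third decimal place, which is the basis for saying that the letter-cloze score adds hardly anything to the prediction.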

The question of whether cloze tests are better measures of foreign
language achievement than CEEB-type tests can only be answered by research
which would assess their validities with reference to a more ultimate kind
of criterion than was available in the present study. Nevertheless, from
the evidence presented here, particularly the evidence (Chapter 4) as to the
kinds of verbal tests with which the word-cloze test correlates most highly,
there is a suggestion that cloze-type tests are inferior as measures of
foreign language achievement because they involve too much extraneous
variance.

Correlations of cloze tests with intelligence tests

Further insight into the nature of the abilities measured by the
foreign language cloze tests is gained by an inspection of their correlations
with intelligence tests or with "IQ" ratings as found in school records.
Data are available from School D, where French cloze tests were given, and
from School E, where German cloze tests were given; the statistics are to be
found in Tables 5.6 and 5.7.

In School D, the correlations tend to be higher for the letter-cloze
test than for the word-cloze test, the values for combined groups being .44
and .26 respectively. On the other hand, at School E, the correlations are
approximately similar: values of .61 and .48 are obtained for the
letter-cloze test as compared to values of .50 and .61 for the word-cloze
test.

Intelligence tests, as measures of verbal ability, would be expected to
correlate rather appreciably with achievement in a foreign language, and this
expectation is confirmed by these data. Nevertheless, to judge from the
results with French cloze tests, verbal intelligence is more demanded by the
letter-cloze tests than by the word-cloze tests. One may speculate that
successful performance on a letter-cloze test depends to a considerable
extent upon generalized verbal habits which relate to orthography and which
are transferred between related languages; it is probable that intelligence
tests measure the presence of such habits. Word-cloze tests, on the other
hand, depend much more on actual knowledge of the language in question,
however acquired; the writer's (unpublished) studies of aptitude for
language learning show that intelligence is not necessarily related to
success in language learning.

Correlations of cloze tests with foreign language aptitude tests

For several samples, correlations of cloze and other tests with the
Carroll-Sapon language aptitude test were available, as shown in Tables 5.5
and 5.7. Close inspection will show that the pattern of correlations rather
closely follows that of the correlations with teachers' grades. This result
is not unexpected: since the language aptitude test is designed to predict
teachers' grades and since it does indeed correlate quite highly with
teachers' grades in the present study, it might be expected to correlate with
any variable highly correlated with teachers' grades, as the cloze tests
generally are. This result suggests that the correlation between cloze tests
(particularly the word-cloze tests) and teachers' grades reflects a common
underlying variance which is central to foreign language learning; that is,
that whatever else they measure, the cloze tests measure the central core of
language achievement rather than some special variety of foreign language
competence. This conclusion is not inconsistent with the observation that
the cloze tests may not measure foreign language achievement very well, nor
with the observation that the cloze tests may measure in addition some
special competence in performing tests of the cloze type, as has been
suggested in Chapter [number illegible].

Item analyses of word-cloze tests

For all available secondary school cases in French and in German, item
analyses were performed for the 40 word-cloze items given to each of the
groups. Item analysis consisted of finding the proportion of the total
sample that got each item correct, and the biserial correlation of the item
responses with total score on 40 items. The results are shown in Tables
5.10 and 5.11, the items being arranged in a series of somewhat ad hoc
grammatical categories.
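The two item statistics named above can be sketched for a single item as follows. This is a minimal illustration using the usual textbook form of the biserial coefficient, which is assumed here rather than taken from the report; the function name and the eight-examinee data set are invented for the example.

```python
from statistics import NormalDist, mean, pstdev


def item_analysis(item_correct, total_scores):
    """Return (proportion correct, biserial r with total score) for one item.

    item_correct: 1 if the examinee got the item right, else 0
    total_scores: total test score for the same examinees, in the same order
    """
    n = len(item_correct)
    p = sum(item_correct) / n
    q = 1.0 - p
    passed = [t for c, t in zip(item_correct, total_scores) if c]
    failed = [t for c, t in zip(item_correct, total_scores) if not c]
    if not passed or not failed:
        # The coefficient is undefined when everyone passes or everyone fails.
        return p, float("nan")
    # Ordinate of the unit normal curve at the point dividing it into p and q.
    normal = NormalDist()
    y = normal.pdf(normal.inv_cdf(p))
    r_bis = (mean(passed) - mean(failed)) / pstdev(total_scores) * (p * q / y)
    return p, r_bis


# Invented data for eight examinees (not taken from the study).
item = [1, 1, 1, 0, 0, 0, 1, 0]
totals = [30, 18, 25, 12, 22, 10, 20, 15]
p_correct, r_biserial = item_analysis(item, totals)
print(round(p_correct, 2), round(r_biserial, 2))
```

Note that the total score here includes the item itself, as in the text, and that with small or non-normal samples the biserial coefficient can exceed 1.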

In terms of the proportions correct, it is clear that the
"content-bearing" form-classes such as nouns, adjectives, and verbs are much
more difficult to guess from context, while certain "function words" like
prepositions and relative pronouns are easy to guess. Nevertheless, there is
no consistent pattern differentiating the various form classes with respect
to item validity. High item validities are found for both content-bearing
form-classes and for function words; likewise, low validities can be found
for both of these broad categories. It is likely that the validity of a
particular word-cloze item is a function of the degree to which it is
suggested by the context, and we may even go so far as to suggest that the
item validity can be taken as a measure of the strength of context.

These item analyses suggest that in constructing word-cloze tests, there 
is little advantage to be gained by selecting particular kinds of words to 
delete. The use of any randomly selected words would produce results of the 
same nature. 

The use of community-of-response scores

TABLE 5.10

Item Reliabilities (r bis) and Proportions Correct, by Part of Speech:
French Word-Cloze Passages

Items 1-20 are from passage F 3; items 21-40 are from passage F 10.

N = 208 high school students

[Table 5.10: item-level entries not recoverable from the source.]


TABLE 5.11

Item Reliabilities (r bis) and Proportions Correct,
by Part of Speech: German Word-Cloze Passages

[Table 5.11: item-level entries not recoverable from the source.]


Up to this point, all the data reported are for cloze scores computed in
the manner recommended by Taylor: the score is the number of items supplied
by the examinee which are exactly the same as those standing in the original
text. In Chapter 2, however, it was argued that cloze scores based on a
community-of-response scoring scheme might be a better method of measuring
individual differences. A community-of-response scheme gives credit to a
response in proportion to the extent to which that response is also given
by the other members of some representative sample of examinees. For the
present study it was decided, for convenience, to compute a
community-of-response score by counting as "correct" not only the item
standing in the original text but also any other "reasonable" completion
given by at least 25% of the sample.* For example, in word-cloze passage
F 3, Item no. 1 had the context "Le chef ne voulait pas être surpassé par
les ----," the correct answer to which was blancs; [percentage illegible] of
the sample gave autres, however, which was also allowed as an answer. The
additional answers allowed for all the word-cloze passages are as follows:



Passage F 3                    Passage F 10

 1. autres                      1. grand, vieux
 2. les, des                    2. route
 9. les                         3. moi
19. et                          7. pouvais, voulais
20. chef                        8. longue, grande
                                9. nous
                             1[?]. d', par, avec

* In determining what was a "reasonable" commonly given response, the data
obtained from adults was also referred to. If 25% of the adults gave a
certain response, it was included among the new "correct" responses for
students, irrespective of how many students gave this response. In all cases
a response had to be spelled correctly and given in the correct grammatical
form to be counted as correct in the community-of-response scoring.

To study the community-of-response score, it was decided to draw a random
sample of 100 cases from the French secondary-school groups only. No study
of the community-of-response score was done for German word-cloze passages
and there was no study of letter-cloze materials in either language.
Community-of-response [text missing in the source] validity.



For the conventional scoring system the correlation between passages
3 and 10 was .3861, yielding a stepped-up reliability of .5571. For the
community-of-response scoring system the correlation was .4591, yielding a
stepped-up reliability of .6293. To test the significance of the difference
between the two raw correlations (.3861 and .4591), the procedure recommended
by Peters and Van Voorhis (26, p. 285, formulas 107 and 108a) was used.
This necessitated setting up the matrix of intercorrelations among the four
scores involved, shown in Table 5.12. The test yielded a critical ratio
of 1.35, which does not allow the rejection of the null hypothesis that the
two scores are equally reliable. The added cost and effort required in
establishing a community-of-response scoring key does not seem worthwhile,
to judge from these limited results.



Unfortunately, for the 100 cases for which community-of-response scores
were computed, the only external measure available to judge the relative
validity of the two scoring systems as a measure of individual differences
was the Language Aptitude Test (actually available for only 98 cases),
already known to correlate rather well with language achievement measures.
The total scores on Passages 3 and 10 scored by the conventional method had
a correlation of .4627 with the Language Aptitude Test; when the passages
were scored by the community-of-response method the correlation was .2630.
The difference between the correlations, as tested by the procedure specified
by McNemar (17, p. 148, formula for t), is .1997, significant at the 1%
level; thus we are inclined to conclude that the conventional scoring method
is a better measure of individual differences in language achievement.


TABLE 5.12

Matrix of Intercorrelations among
Conventional and Community-of-Response
Scores for French Word-Cloze Passages

N = 100 secondary school students

[Table 5.12: entries not recoverable from the source.]


CHAPTER 6

The Feasibility of Cloze Procedure in the Auditory Modality



The bulk of the work which has been reported here, for reasons set
forth in Chapter 1, has concerned the use of cloze procedure in written tests.
Taylor (29) has shown, however, that just as the "readability" of printed
materials can be evaluated by means of cloze procedure, so also the
"comprehensibility" of auditory messages can be measured by an analogous
procedure. The present study being concerned with individual differences,
it was desirable to determine whether cloze procedure in the auditory
modality is an effective measure of individual differences. This chapter
reports an experiment designed not only to explore the feasibility of
auditory cloze procedure but also to investigate the effect of varying rate
and intonation in the presentation of auditory cloze materials.

The study utilized cloze materials solely in English, in order to
facilitate obtaining suitable subjects.



Stimulus Materials

The English forms of two passages used in other phases of this project
were selected. These were passages 3E and [designation illegible] used in
the study reported in Chapter [number illegible]. In order to increase the
number of test items, every seventh word was deleted instead of every tenth,
and consequently each passage was divided into twenty-eight items. Each item
consisted of six words preceding a deletion and four words following the
deletion. Items subsequent to Item Number 1 thus started with the last four
words of the previous item. Proper names were included in the text but were
never deleted. When a proper name was a seventh word the word following it
was deleted.

The test items were then recorded on magnetic tape; they were voiced by
the experimenter. Two forms were prepared. One form is designated as the
"expressive" form and the other is designated as the "non-expressive" form.
In each item of the "expressive" form the experimenter, after reading the
item number, read the first six words with normal expressive intonation at
his normal speech rate; then a standard noise (produced by pressing the
multiplication set-up key of a Monroe desk calculating machine) was
substituted for the deleted word and the remaining four words were read with
normal intonation and normal rate. In the "non-expressive" form the
experimenter said the item number and then proceeded to utter the first six
words of the item at the rate of one word per second. The words were read in
a staccato monotone. As before, the noise of a desk calculator was
substituted for the deleted word and then the remaining four words of the
item were read in the same way as the first six words had been read. In both
the "expressive" and "non-expressive" forms a pause of ten to twelve seconds
was allowed before the tape went on to the next item. The pause was used by
subjects to write their answers.
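The item construction described under Stimulus Materials (every seventh word deleted, six words of lead-in, four words of follow-on) can be sketched as below. The function and field names are hypothetical. How the count resumed after a proper name forced the deletion onto the following word is not stated in the text; the sketch assumes the next deletion is counted seven words from the word actually deleted.

```python
def make_auditory_items(words, step=7, before=6, after=4, proper_names=frozenset()):
    """Split a passage into overlapping cloze items for spoken presentation.

    Each item holds the words read before the noise ("lead"), the deleted
    word replaced by the noise ("deleted"), and the words read after it ("tail").
    """
    items = []
    i = step - 1  # zero-based index of the seventh word
    while i < len(words):
        # A proper name is never deleted; delete the word following it instead.
        while i < len(words) and words[i] in proper_names:
            i += 1
        if i >= len(words):
            break
        items.append({
            "lead": words[max(0, i - before):i],
            "deleted": words[i],
            "tail": words[i + 1:i + 1 + after],
        })
        i += step  # assumption: counting resumes from the deleted word
    return items


# Placeholder passage of 29 numbered "words".
passage = [f"w{n}" for n in range(1, 30)]
items = make_auditory_items(passage)
for number, item in enumerate(items, start=1):
    print(number, " ".join(item["lead"]), "(noise)", " ".join(item["tail"]))
```

With these settings the four words after one deletion are also the first four words of the next item, which is the overlap the text describes.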



Test Booklets

Each subject was presented with a test booklet which consisted of a
"Practice" sheet, which had four blank answer spaces, and two test sheets,
consisting of twenty-eight blanks for each of the passages.

Instructions to Subjects


A tape with instructions for subjects, which included a practice session,
was played at the beginning of each experimental session. The text of this
tape follows:
"This is an experiment to see how well people can replace words that
have been taken out of spoken material. You do this perhaps uncon-
sciously every day when chance noises make inaudible some of the words
that are spoken to you and you have to guess what was being said. We
want to explore systematically how people go about doing this. In order
to get a record of the guesses you make, several samples of English text
were broken down into items. Each item consisted of several words; then
a rattling noise; and several more words. You are to write
down on your answer sheet the word that you believe was covered up by
the rattling noise. Time will be allowed for you to write your answer
down before the test goes on to the next test item. You will not be
allowed to hear an item more than once, but, because we are using con-
tinuous passages of text, each new item will always include a few words
of the end of the previous item. Let us practice this procedure a bit
before we start the experimental tests. Use your answer sheet labeled
'Practice'. Write your name and get ready to write your answer opposite
Item 1. - - Pause - -

"Ready! Item One: 'Twinkle, twinkle little - - (noise) - - how I
wonder' - - Pause - -

"During the pause you should have written the word you think should
have followed the word 'little', as for instance, 'star'.

"Item Two: 'How I wonder - - (noise) - - you are. Up above' - - Pause - -

"Have you written the word that should have followed 'what'?

"Item Three: 'Are. Up above the - - (noise) - - so high, like'
- - Pause - -

"Item Four: 'So high, like a diamond - - (noise) - - the sky.'

"The answers were, of course, Number One: 'star'; Number Two: 'you';
Number Three: 'world'; Number Four: 'in'. Notice that we never have
more than one word for an answer. This is an important part of the test.
You are to write only one-word answers.

"Now we are ready to go on to the experiment itself. The material
will be unfamiliar, and, we hope, will be much more difficult."
Subjects

Twenty volunteers were recruited as subjects from among the per-
sonnel of the Laboratory for Research in Instruction and the Harvard
Graduate School of Education. All subjects had had at least some
college undergraduate training. Most of them were graduate students.



* "What" was incorrectly taped into this place. Subjects were told
the correct answer by the experimenter personally.
Experimental design

A Latin square design was used and arranged so that ten subjects were
administered Passage A in the "expressive" form and Passage B in the "non-
expressive" form. Ten other subjects were administered Passage A in the
"non-expressive" form and Passage B in the "expressive" form. Within these
groups of ten subjects, subgroups of five subjects each were arranged so
that the passage administered first was either A "expressive", or A "non-
expressive", or B "expressive", or B "non-expressive". Table 6.1 shows the
schedule of experimental procedures for all the subjects. The structure of
the design is also made apparent in the tabulation of results in Table 6.2.
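The counterbalancing just described can be written out as a short Python sketch. The subject numbering and the function name are hypothetical; the actual assignment of individuals to subgroups is the one given in Table 6.1, which this sketch does not attempt to reproduce.

```python
from itertools import product

def build_schedule(n_per_subgroup=5):
    """Return one record per subject: which passage/form comes first and second."""
    schedule = []
    subject = 1  # hypothetical subject numbers, for illustration only
    for form_a, first in product(("expressive", "non-expressive"), ("A", "B")):
        # Passage B always receives the form that Passage A does not.
        form_b = "non-expressive" if form_a == "expressive" else "expressive"
        forms = {"A": form_a, "B": form_b}
        second = "B" if first == "A" else "A"
        for _ in range(n_per_subgroup):
            schedule.append({
                "subject": subject,
                "order": [(first, forms[first]), (second, forms[second])],
            })
            subject += 1
    return schedule
```

Calling `build_schedule()` yields twenty records in four subgroups of five, each subgroup starting with a different passage-and-form combination.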

Whenever feasible, five subjects were experimented upon at once. But
groups were not always to be obtained, and on occasion one subject was
experimented upon at a time. Generally groups of two or three were experi-
mented upon at the same time.

The experimental procedure lasted about thirty minutes. Subjects
were asked for their reactions after the session. The ensuing conversations
lasted from ten minutes to a half hour.

Treatment of the data 

In tabulating the scores of subjects, an answer was scored as correct 
if, and only if, the word written by the subject was exactly the same as the 
word used by the author (or translator) of the passage. This criterion 
applied to the grammatical form of the word, as well. The sum of all such
items scored as correct in a passage is the subject's score for that passage.
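The exact-replacement criterion can be illustrated with a few lines of Python. The function name is hypothetical, and stripping surrounding whitespace from the written answer is an assumption; everything else, including capitalization and grammatical form, must match exactly.

```python
def score_passage(responses, originals):
    """Count answers identical to the author's (or translator's) word."""
    if len(responses) != len(originals):
        raise ValueError("one response is expected for every deleted word")
    # Assumption: only leading/trailing whitespace is ignored.
    return sum(r.strip() == o for r, o in zip(responses, originals))
```

Under this rule a response such as "worlds" earns no credit when the original word was "world".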

Statistical analysis 

TABLE 6.1

Schedule of Experimental Treatments

(1) The effect of altering the speech pattern

Table 6.2 is a tabulation of the raw data. Since the experimental
design obviates the necessity for taking into consideration in the statistical
analysis passage difficulty or position effects, a simple "t" of the differ-
ence between correlated means can be applied to the sum of scores obtained
from summing the rows of cells. There is no statistically significant
difference between the means in question. Hence it cannot be concluded that
experimental variation in the speech pattern affects the difficulty of
auditory cloze.
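For readers who wish to see the computation, a "t" for the difference between correlated means can be sketched as follows. The function name is hypothetical and the scores in the usage note are invented; the raw data of Table 6.2 are not reproduced here.

```python
import math
from statistics import mean, stdev

def correlated_means_t(x, y):
    """Return (t, degrees of freedom) for paired scores x and y.

    x and y hold the two scores of the same subjects, in the same order.
    The differences must not all be equal, or the standard deviation is zero.
    """
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    t = mean(d) / (stdev(d) / math.sqrt(n))
    return t, n - 1
```

For example, `correlated_means_t([10, 12, 14], [9, 12, 12])` gives a "t" with 2 degrees of freedom; the value is then referred to a table of the t distribution.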

The number of subjects who answered an item correctly was taken as a
score for each item. The scores in each condition were correlated with each
other for each passage. The correlation coefficient, "r", between the
"expressive" and "non-expressive" forms for passage A is equal to .8185.
The correlation for passage B is equal to .8590. It can thus be seen that
the variation in the experimental conditions did not affect the pattern of
difficulty of the test items to any appreciable extent.
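The item-level analysis above amounts to counting correct answers per item in each condition and correlating the two sets of counts. A minimal Python sketch follows; the function names and the 0/1 score matrix layout (one row per subject, one column per item) are assumptions for illustration.

```python
import math

def item_scores(matrix):
    """Number of subjects answering each item correctly (columns of a 0/1 matrix)."""
    return [sum(col) for col in zip(*matrix)]

def pearson_r(x, y):
    """Product-moment correlation between two equally long lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

In use, `item_scores` would be applied once to the "expressive" answers and once to the "non-expressive" answers for a passage, and `pearson_r` to the two resulting lists of twenty-eight item scores.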

(2) Feasibility of the test form 

As no effect due to experimental treatment was found, it is
possible to ignore the classification of the data according to experimental
treatment and to use the scores of each passage as though they comprised a
half-test in order to compute a reliability measure. The Spearman-Brown
reliability measure yielded by this procedure is .8342. Examination of the
frequency distribution yielded by the test suggests that the scores are
normally distributed. The data thus indicate that the test is a reasonably
reliable measure of some kind of individual differences. However, the sub-
jective reports of the subjects collected after each test session indicated
that, generally speaking, the test procedure is found to be tedious, con-
fusing and irksome. Subjects seemed to show a slight preference for the
"expressive" form.
Summary and Conclusions

Two English auditory cloze tests were prepared on magnetic tape. The
tests consisted of 28 items, each made up of six words, a deletion (signified
by a noise) and another four words. The consecutive items of each test
constituted a passage. Each passage was prepared in an expressive and non-
expressive form. A Latin square design was used so that each passage and
each form was presented to all twenty English-speaking subjects. It was
found that although the auditory cloze procedure proved to be a feasible
measure of individual differences, as indicated from the reliability yielded
by the data and the frequency distribution, the test form was unpleasant for
the subjects. No difference was found between the expressive and non-
expressive test forms. This finding suggests that auditory cloze scores will
remain stable over a wide range of differences in intonation and rate of
speakers. It does not imply that intonation plays no role in comprehension
for it is probable that intonation carries information on a band of
communication not affected by cloze procedure.




Summary, Conclusions, and Recommendations



The present investigation was undertaken in order to ascertain whether
"cloze" item types, in which the subject is required to restore a missing
element in a string of continuous text, would provide a useful, reliable,
and valid technique for measuring proficiency in a second language. It was
hoped that tests composed of this type of item would: (a) be cheaper and
simpler to construct than conventional tests, (b) draw upon a broad and
representative sample of language habits rather than on a few specific
knowledges, (c) yield a rational scale for measuring competence in terms of
the extent to which native performance in the language is achieved, and
(d) measure accurately at the upper levels of proficiency, where there is
possibly a ceiling effect in the conventional item types.

Certain difficulties were anticipated, however. It was expected that
"guessing", which is inherent in the procedure, would tend to reduce relia-
bility; that "intelligence" and other dimensions of cognitive ability would
prove to be disturbing variables; and that the "readability" of the texts
would be a source of variation. Nevertheless, it was reasoned from the
history of the use of the same and similar techniques, as well as from recent
developments in information theory, that a measure of the ability to produce
linguistic material consonant with the contextual constraints of the stimulus
material might well prove to be an index of linguistic proficiency.

The study was conducted as a pilot investigation in order to explore a
number of facets of the problem.

Two types of items were investigated: word-cloze and letter-cloze.
The word-cloze items were modeled very closely after the materials used by
Wilson Taylor (26) and named "cloze procedure" by him. They consisted of
passages of continuous text 205 words in length in which every 10th word was
deleted. The letter-cloze items were an adaptation of test items used in an
experimental study by Miller and Friedman (20) and consisted of strings of
5, 7, 9, or 11 letters (including spaces as letters) in which the first,
middle, or last letter had been deleted. In both types of test, it was the
task of the subject to attempt to restore the linguistic material which had
been deleted.
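The two item types can be illustrated with a short Python sketch. The function names and the width of the blank are assumptions; the sketch only mirrors the deletion rules stated above (every 10th word of a passage, or the first, middle, or last character of a short string).

```python
def word_cloze(words, step=10, blank="_____"):
    """Delete every `step`-th word; return the mutilated text and the answers."""
    text, answers = [], []
    for i, word in enumerate(words, start=1):
        if i % step == 0:
            answers.append(word)
            text.append(blank)
        else:
            text.append(word)
    return text, answers

def letter_cloze(string, position):
    """Delete the first, middle, or last character (spaces count as letters)."""
    index = {"first": 0, "middle": len(string) // 2, "last": len(string) - 1}[position]
    return string[:index] + "_" + string[index + 1:], string[index]
```

A 205-word passage treated this way yields 20 deletions, and `letter_cloze("the c", "middle")` returns the string `"th_ c"` together with the deleted letter `"e"`.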

Twenty word-cloze tests in English, French, and German were drawn from a
corpus of ten prose selections for which English, French, and German versions
were available. The basic materials ranged in complexity from Reader's
Digest articles to Kant's philosophical writings. Application of one of
Flesch's original readability formulas (12) showed that the sample was
representative of all Flesch readability levels. English letter-cloze materials
were supplied through the courtesy of Miller and Friedman, and French and
German letter-cloze materials were developed by applying Miller and Friedman's
method to appropriate texts. The materials were presented in typewritten,
mimeographed form. Deletions were always indicated by the underline charac-
ter of the typewriter.

A word, in English and German, was defined as a string of letters bounded
by a space at the beginning and end. In French, allowances were made so that
morphemes such as d' (e.g., in d'argent) could be counted as words. In most
studies conducted here, responses were scored as correct only when the
deleted word or letter was replaced exactly as it appeared in the original
text. One study tested the feasibility of using a community-of-response
criterion.

Groups of English-French and English-German adult bilinguals of rela-
tively high academic achievement were tested with these materials in order
to establish a norm of native-speaker performance and to compare results for
the three languages. The mean scores of passages ranged rather widely and
significantly as expected. Versions of the same material varied in their
relative difficulty over the three languages. The individual passages
varied, too, in their ability to predict the total score on all passages.
Total reliabilities of word-cloze tests consisting of 9 passages in the same
language ranged from [...] to [...]. Scores on letter-cloze tests proved to be
dependent upon the number of letters in the string and the position of the
deletion. Reliabilities of these tests also varied considerably. The best
reliabilities were yielded by items in which the deletion occurred in the
middle.

No significant differences were found between word-cloze scores in the
three languages, although French and German letter-cloze scores proved on the
whole to be lower than English letter-cloze scores. These findings might
easily be attributed to peculiarities of the relatively small sample both of
persons and of passages, but they also seem to be in accord with expectations
based on information theory, on account of the larger effective alphabets of
French and German when account is taken of diacritical marks.

In addition to these findings about the nature of the tests, the studies
of adult bilinguals indicated that native speakers of a language show con-
siderable variation in their ability to restore texts in their native
language. Their ability to restore texts in a second language, in which they
have near-native proficiency, is slightly (but significantly) poorer than
their ability to restore texts in their native language. Further, the scores
in the two languages were substantially correlated with each other. This
suggested that the ability to restore texts is somewhat independent of
competence in a language as ordinarily defined.
To further ascertain the characteristics of cloze tests several studies
were conducted in English on native speakers. In a study in which the
subjects were Harvard upperclassmen and graduate students it was found that
the measure of the difficulty of the passage is sensitive to the particular
[...] were used. This finding is in accord with Taylor's (27) finding for
passages of 175 words containing 35 deletions. In the present study the
sensitivity of scores to the particular word deleted was demonstrated by
comparing the adjusted means of passages in which every tenth word was deleted
with the adjusted means of the same passages in which the deletions were moved
over by one word. The adjustments were made by means of applying analysis of
covariance to the data in which a third (control) passage was given to both
the experimental and control group. It was also shown, in a similar experi-
mental design, that subjects depend considerably upon contextual cues from the
total passage. In this case the experimental manipulation consisted of
dividing the passages into ten-word items (with the sixth word deleted) and
scrambling the sequence of these items.
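The scrambling manipulation can be pictured with the following Python sketch. It assumes that the ten-word items are consecutive, non-overlapping segments of the passage and uses a seeded shuffle so the result is repeatable; the function name, the blank, and both assumptions are illustrative rather than a record of how the materials were actually prepared.

```python
import random

def scrambled_items(words, item_len=10, deleted=6, seed=0, blank="_____"):
    """Split a passage into ten-word items, delete the sixth word, and shuffle."""
    items = []
    k = deleted - 1  # zero-based position of the deleted word within an item
    for start in range(0, len(words) - item_len + 1, item_len):
        chunk = words[start:start + item_len]
        items.append((chunk[:k] + [blank] + chunk[k + 1:], chunk[k]))
    random.Random(seed).shuffle(items)  # items lose their paragraph order
    return items
```

Each returned pair holds one mutilated ten-word item and its answer, so the same scoring rule can be applied to connected and scrambled presentations.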

The data from the last-mentioned experiment made it possible to examine
what kinds of items, from the point of view of syntactical structure, were
most susceptible to the influence of paragraph cues. It was shown, within
the limits of the sample of items available, that prepositions and adjectives
(i.e., "function words") tended not to be affected by paragraph cues while
nouns and verbs (i.e., "content-bearing words") were susceptible to influ-
ences from the paragraph. When the scores achieved on a test consisting of
20 totally unrelated items, taken from the entire sample of items available,
were compared with the scores obtained on continuous passages and on scrambled
passages it was apparent that this diminution of contextual material further
decreased the scores. The fact that paragraph context acts as an extraneous
source of variation is further substantiated by the fact that reliability of
scrambled passages is considerably higher than reliability of connected
passages. However, it was determined that there was no "cumulative" effect
of context; i.e., the degree to which context determined the responses to the
last eight items of a paragraph was no greater than that for the first eight
items.

Despite the fact that Taylor named the cloze test because it involves
making "closure" in the sense of the word established by Gestalt psycholo-
gists, the ability to do cloze tests failed to show any correlation with
Thurstone's (12) "Four Letter Word Test" which is reported to show a high
loading on the "speed of closure" factor. Data collected in an independent
factor analytic study by Weinfeld (32) using 9th, 10th, and 11th grade school
children made it possible to show that the ability to do cloze tests in one's
native language is related to reasoning, to verbal ability, to the ability
to write good themes, and to expressive fluency.

Students from five public and private secondary schools in the Boston
area were used to test the feasibility of cloze tests as measures of foreign
language proficiency. Two 20-deletion word-cloze passages of appropriate
difficulty and reliability, and two sets of letter-cloze materials consisting
of 9- and 11-letter strings with the middle letter deleted, were administered
in the respective language to 205 third, fourth, and fifth year students of
French, and to 63 second and third year students of German. The time limits
were sufficient to allow most students to attempt the last item of each of the
4 tests; the total testing time for the four tests including instructions was
36 minutes. The students appeared to accept the tests readily and were able
to carry out the instructions. The tests were scored by allowing credit
whenever the student restored the text exactly as it had stood in the original.
Collateral data, secured wherever possible, included scores on the
Carroll-Sapon foreign language aptitude test, teachers' grades in the language
courses, students' academic averages, intelligence test scores, and scores on
College Board foreign language examinations.

On the two word-cloze tests combined, with a maximum possible score of
40, third-year French students had an average score of 14.5, and fourth-year
French students had an average score of 16.1, as compared with adult native
average performance of 27.4. On the two letter-cloze tests, with a maximum
possible score of 100, third-year French students had average scores of 43.7,
and fourth-year French students had average scores of [...] as compared with
adult native average scores of 78.1. Thus, third-year French students do
about 53% to 56% of adult performance, while fourth-year students do about
58% of adult performance, in both word-cloze and letter-cloze tests. The
similarity of the percentages suggests that the cloze-procedure provides a
rational scale along which performance can be measured in meaningful units.
Somewhat different results were obtained for the high school students of
German, but it should be pointed out that these results are for 2nd and 3rd
year students rather than for 3rd and 4th year students as in French. Second-
year German students made average scores of 7.2 (24% of adult native per-
formance) on the word-cloze test, and 46.7 (61% of adult native performance)
on the letter-cloze test. Third-year German students made average scores of
10.2 (35% of adult performance) on the word-cloze test, and 47.8 (62% of
adult performance) on the letter-cloze test; thus the constant proportion-
ality observed in the case of the French results failed to occur.
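The "rational scale" referred to above is simply the learner group's mean expressed as a percentage of the adult native mean. A minimal Python sketch, with a hypothetical function name, is given below; the figures in the usage note are the French means as they can be read from the text.

```python
def percent_of_native(learner_mean, native_mean):
    """Express a learner group's mean score as a percentage of the native mean."""
    return 100.0 * learner_mean / native_mean
```

For instance, `percent_of_native(43.7, 78.1)` is roughly 56, and `percent_of_native(16.1, 27.4)` is a little under 59, in line with the percentages quoted for the French letter-cloze and word-cloze tests.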

Although there were wide individual differences in performance, there
were no significant differences between the average performance of adjacent
high-school year-groups on any of the tests. Either the instruction itself
fails to cause significant gains from the 2nd to the 3rd or from the 3rd to
the 4th years, or the present tests are insensitive to the gains. Neverthe-
less, high school students are significantly different from adult native
speakers in their performance.

The reliabilities of the word-cloze tests in the high school samples
were only moderate. For the 40-item French word-cloze test they ranged from
.37 to .74 in various samples, and for the 40-item German word-cloze test
they were .66 and .61 in two samples. Standard errors of measurement for
these scores were fairly constant, ranging from 2.2 to 2.6. The magnitudes
of the reliability coefficients were largely a function of different ranges
of ability.
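The remark that reliabilities vary with range of ability while standard errors stay nearly constant follows from the usual relation between the two, SEM = SD * sqrt(1 - r). The Python sketch below uses that standard formula; the standard deviations in the usage note are invented for illustration and are not taken from the study.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """Standard error of measurement from a score SD and a reliability coefficient."""
    return sd * math.sqrt(1.0 - reliability)
```

A narrow-range group with a hypothetical SD of 3.0 and reliability .37, and a wide-range group with a hypothetical SD of 4.7 and reliability .74, both come out with a standard error close to 2.4.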



The reliabilities of 100-item letter-cloze tests ranged from [...] to .75
for French and from .80 to .88 for German. Standard errors of measurement
were distinctly lower in German, being 4.8 to 5.1 as compared with 6.6 to 6.8
in French.

The reliability of cloze-procedure materials is such that in order to
assure the achievement in typical secondary-school groups of a reliability
coefficient of .65, say, it would be necessary to administer a word-cloze
test of about 180 items (of 205-word paragraphs) with about an 80-minute
time-limit; for letter-cloze, tests of 250 items with a 30-minute time-limit
would be necessary.
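Estimates of this kind are conventionally obtained from the Spearman-Brown relation, which the study also uses for its split-half reliability. The Python sketch below assumes that relation; the function names are hypothetical, and the values in the usage note are invented rather than taken from the study's samples.

```python
def spearman_brown(r, factor=2.0):
    """Reliability of a test lengthened `factor` times (factor=2 steps up a half-test)."""
    return factor * r / (1.0 + (factor - 1.0) * r)

def length_factor(r_current, r_target):
    """How many times longer a test must be to reach the target reliability."""
    return r_target * (1.0 - r_current) / (r_current * (1.0 - r_target))
```

As an invented example, a test with reliability .50 would have to be made about four times as long to reach .80, and a half-test correlation of .50 steps up to about .67 for the full test.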

The study provided abundant evidence that paragraph-length word-cloze
tests and letter-cloze tests measure somewhat different aspects of second-
language proficiency. In both French and German, correlations between word-
cloze and letter-cloze tests were far from unity even after correction for
attenuation. This was true, incidentally, both for high-school students and
for adult native speakers.
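The correction for attenuation mentioned above divides an observed correlation by the geometric mean of the two tests' reliabilities. The sketch below uses that standard formula with a hypothetical function name; the numbers in the usage note are invented.

```python
import math

def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimated correlation between two measures if both were perfectly reliable."""
    return r_xy / math.sqrt(r_xx * r_yy)
```

With an observed correlation of .60 and reliabilities of .64 and .81, the corrected value is about .83, still well short of unity.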












The study involved numerous computations of correlations between cloze-
test scores and various other measures of foreign language achievement, but
it cannot be said that a completely satisfactory assessment of the validity
of word-cloze and letter-cloze tests was made. This would require a criterion
considerably more ultimate in its nature than any that were available.
Favoring validity is the evidence that cloze tests differentiate significantly
between learners and those who have achieved native mastery of the language.
Also favoring satisfactory validity is the evidence that at least the word-
cloze tests correlated quite substantially with teachers' grades; letter-
cloze scores tend to correlate much lower with teachers' grades. It may be
suggested that word-cloze tests make greater demands on the ability of the
learner to select appropriate words and grammatical forms to fit a context,
whereas the letter-cloze test demands chiefly a sensitivity to the orthogra-
phic customs of a language system and a knowledge of the proper spelling of
foreign language words.

Word-cloze scores also correlated reasonably well (r = .616, N = 76)
with CEEB French scores, whereas the letter-cloze scores correlated with
these scores only to the extent of .354 and added nothing significant in a
multiple correlation. Nevertheless, the corrections for attenuation suggest
that neither the word-cloze nor the letter-cloze test measures the same kind
of proficiency as the CEEB language examination. The results reported in
Chapter [...] are strong reasons for believing that cloze procedure tests depend
to a considerable extent upon cognitive ability variables which are com-
pletely extraneous to foreign language success. That is to say, even an
individual who has good mastery of a foreign language may not be able to
demonstrate this mastery on a cloze-procedure test if he lacks certain other
intellectual qualities such as reasoning ability and ideational fluency.
This conclusion is further supported by the fact that cloze tests in
foreign languages had substantial correlations with intelligence tests in
several high school samples. This was particularly true of letter-cloze
tests, which had correlations of [...] to .61 in several groups, as compared to
correlations of .26 to .61 for the word-cloze test.

The pattern of correlations between cloze tests and language aptitude
test scores is similar to that between cloze tests and teachers' grades;
that is, the correlations of language aptitude scores are higher with word-
cloze scores than with letter-cloze scores.

Item analysis of the word-cloze tests administered in the high schools
revealed that content-bearing form classes, such as nouns, adjectives and
verbs are much more difficult to guess from context, while certain function-
words like prepositions are easy to guess. This easily anticipated result
is in accord with the findings of the item-study of the experiment in which
the items within passages were scrambled. For purposes of test construction,
however, it is important to note that both high and low item validities were
found for both content-bearing form classes and for function words, so that
little would be gained from systematically selecting one item type or another.

The conventional scoring system of counting as correct only the word
given in the original text was compared with a community-of-response scoring
system for French word-cloze data. The community-of-response scoring system
yielded a higher reliability (using the passages as split halves of the test).
The difference between the two reliabilities, however, was not statistically
significant. On the other hand the conventional scoring yielded a signifi-
cantly higher correlation with CEEB scores than the community-of-response
scoring system, thus suggesting that the conventional scoring system is more
valid, or at least that the words used by the authors or translators are
more in accord with the expectations of the writers of the CEEB tests than
are the words used by the group whence the community of response was derived.

To explore the feasibility of testing in the auditory modality, two
English auditory cloze tests were prepared on magnetic tape. The tests
consisted of 28 items, each made up of six words, a deletion (signified by a
standard noise introduced into the tape), and another four words. The con-
secutive items of each test constituted a passage. Each passage was pre-
pared in an expressive form, using normal intonation and speech rate, and a
non-expressive form, using a standard intonation pattern for each word and
a constant rate of presentation. A Latin square design was used so that each
passage and each form was presented to all twenty English-speaking subjects.
It was found that although the auditory cloze procedure proved to be a
feasible measure of individual differences, as indicated by reliabilities
yielded by the data and by the frequency distribution, the test form was
highly unpleasant for the subjects. No difference was found between the mean
scores on the expressive and non-expressive test forms. This finding suggests
that auditory cloze scores will remain stable over a wide range of differ-
ences in intonation and rate of speakers of the material presented to the
subjects. It does not imply that intonation plays no role in comprehension
for it is probable that intonation carries information on a band of communi-
cation not measured by cloze procedure.

Conclusions 

The word-cloze and letter-cloze procedures developed here may be
suitable testing devices to assess group differences in second-language
competence, but they are inadequate as measures of individual differences
because they are relatively unreliable and are too heavily affected by
various sources of extraneous variance. They require large amounts of
examining time and lend themselves well only to testing in the written
language.

Word-cloze and letter-cloze procedures as developed here are [...]
to develop them from suitably chosen texts. Since they require free responses,
however, they are somewhat cumbersome to score even though the scoring can
be made completely objective.



Recommendations 

1. It is not recommended that word-cloze and letter-cloze tests of the
type investigated here be seriously considered by the CEEB as measures of
foreign language achievement for use in Board examinations.

2. There are certain suggestions in the present study as to the lines
along which further investigation might profitably proceed:

a. It may be suggested that the sources of extraneous variance in
cloze procedure might be controlled either statistically, by
adjusting for the individual's "ability to do cloze tests" as
measured by a test or tests in his native language, or experi-
mentally, by searching for new types of cloze tests which are
minimally dependent upon extraneous sources of variance.

b. Word-cloze tests consisting of relatively small segments of a
text (say, ten-word items) may be a suitable way to reduce the
influence of extraneous variance. Such tests would operate by
minimizing the extent to which the examinee could utilize broad
semantic cues afforded by paragraph-length context; the examinee
would thus be forced to rely upon more purely linguistic cues
such as the syntactical structure of the material.

c. The search for new types of cloze-procedure tests should con-
centrate on tasks in which native speakers of a language perform
in a relatively uniform manner, but in which language learners
progressively improve.

d. The use of a multiple-choice type of cloze procedure, in which the
correct response is offered among a number of alternatives,
deserves investigation. This type of item would have the advan-
tage of possibly reducing the extent to which extraneous variance
affects scores, while at the same time having the disadvantage
of failing to test the examinee's ability to produce responses.
It would also have the practical difficulty of requiring test
construction efforts in the creation of alternatives and in stand-
ardizing scores.

e. If cloze-procedure is defined as any procedure in which the
examinee is required to supply an element which will properly
restore a mutilated text, it may be suggested that other kinds of
cloze-procedure tests than the one investigated here might
deserve investigation. For example, it might be found that
Ebbinghaus's original procedure of requiring the subject to supply
missing syllables, rather than letters or words, could be a useful
technique for measuring foreign language proficiency without at
the same time measuring various intellectual traits.

f. Besides cloze-procedure, there are undoubtedly many other ways of
attempting to get at the individual's knowledge of the character-
istics of a language structure without resorting to conventional
test techniques. As examples of techniques which might deserve
to be explored we may mention:

(1) Tests of the individual's ability to discriminate
grammatical from ungrammatical sentences in the language (regard-
less of their degree of meaningfulness or nonsensicality), or to
discriminate normal from mutilated texts.

(2) Tests of the individual's ability to judge the relative
frequency of linguistic items in a language, as compared with the
ability of native speakers to do so.

g. Perhaps the most effective way of studying the above possibilities
would be to use a factor-analytic approach which would study the
intercorrelations of a wide variety of test techniques in a
suitably large sample of language learners or native speakers.
The study should include measures of all four skills in language
proficiency (listening, speaking, reading, and writing) as well as
both conventional and novel kinds of testing techniques.
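Recommendation 2a speaks of adjusting statistically for the individual's native-language "ability to do cloze tests". One conventional way to make such an adjustment, offered here purely as an illustration and not as a method used in this study, is the first-order partial correlation, which removes the influence of a third variable z from the correlation between x and y. The function name and the values in the usage note are hypothetical.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y with variable z held constant.

    Here x could be a foreign-language cloze score, y a criterion such as
    teachers' grades, and z a native-language cloze score (all assumed).
    """
    return (r_xy - r_xz * r_yz) / math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
```

If all three correlations were .50, the partial correlation between x and y would fall to about .33.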
APPENDIX A

NAME                                        BOOKLET NUMBER

Please describe below (1) when, (2) where and (3) under what circum-
stances you learned German. (4) Indicate whether German is your native
language.

(1)

(2)

(3)

(4)

Please describe below (1) when, (2) where and (3) under what circum-
stances you learned English. (4) Indicate whether English is your native
language.

(1)

(2)

(3)

(4)

Try to give an estimate below of your fluency in reading, writing and
speaking German. Indicate whether you have read widely in all kinds of
German writing; whether you have read a good deal but only in certain areas;
whether you have hardly had a chance to read German, etc. We are particularly
interested in how much experience you have had in reading and writing German.

Fluency:

Reading

Writing

Speaking

Extent of Reading

Try to give a similar estimate of your fluencies in English and compare
them with your German fluencies. We are particularly interested in how much
experience you have had in reading and writing English.

Fluency:

Reading

Writing

Speaking

Comparison of German and English Fluencies
APPENDIX B 

Fill the blanks with a word which seems to fit best. 



1» these creatures wear self- 



2* If on closer examination 



coverings^ to be sure of 

+,ln n+. +.h^SP^ 

OilUUJLVA P* v-r w w v 

despite their lack of language 
suppose that in searching for 



3, regard these creatures as ^ 

1;. earth, the pygmies* Let 

nothing can be a language, either of sounds 

6o and to consider them 



be closely related to the 
in their social intercourse# They 
8o ''pygmies" possess a fine tail, which they use in 



7* they do not employ 



9o hitherto unknovm primitive tribes 
tedly upon some 



10* would confirm the impression 
13.* we have attributed greater 
12* make a thorough study 



expedition stumbles unexpec- 



they were human beings* We 
to clothing than to speech* 



the most primitive people on 



13* hypothetical speechless primitive men, 
that 



it then be discovered 



lit* the most primitive sort. 



• protect themselves or, from 



feeling 



15* pass over the fact ^ 
creatures 

16 o or of gestures* On 



17* of modesty, to cover 
18. to the pygmies, it _ 



in classifying these problematical 

ground of their external similarity 
of their bodies, this discovery 



decided, with some misgivings, to 
19. make emotional sounds somewhat the apes; but they have 



man-like creatures vjho differ 



the pygmies only in that 







20 * 
APPENDIX C 

Name:                                  Age: 

French Class                           High School Class 

How long have you studied French? 

Have you ever studied any other foreign language?   yes ___   no ___ 
                                                     (check one) 

If "yes", what other foreign language did you study? 

For how many years? 

Is a foreign language spoken in your home?   yes ___   no ___ 
                                              (check one) 



INSTRUCTIONS FOR PART I 

Part I of this test is made up of two French passages, each from a 
different book. Every tenth word has been replaced by a blank. Your task is 
to read through the passage to see what it is about, and then try to fill in 
each blank with a French word that makes sense. The blanks are all the same 
length, but remember that the words which have been taken out may be short or 
long. As some blanks will be easier to fill than others, it is a good idea 
to do the easier ones first, and then go back to work on the harder ones, if 
you have time. 

Do one page at a time. When you finish one page, wait until your in- 
structor tells you to go to the next. Do not look at the first page until 
your instructor tells you to begin. 

Work rapidly, because you have only nine minutes for each passage! 
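
The every-tenth-word deletion described in these instructions can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure actually used to prepare the test pages: words are taken to be whitespace-separated tokens, punctuation attached to a deleted word is kept outside the blank, and `make_word_cloze` is a name introduced here.

```python
import re

# Split a token into leading punctuation, the word itself, and trailing punctuation.
TOKEN = re.compile(r"^(\W*)(.*?)(\W*)$", re.DOTALL)


def make_word_cloze(text, step=10, blank="_____"):
    """Replace every `step`-th word by a numbered blank; return text and key."""
    pieces, answers, count = [], [], 0
    for token in text.split():
        lead, core, trail = TOKEN.match(token).groups()
        if core:                       # tokens made only of punctuation are not counted
            count += 1
        if core and count % step == 0:
            answers.append(core)
            pieces.append(f"{lead}{blank}({len(answers)}){trail}")
        else:
            pieces.append(token)
    return " ".join(pieces), answers


sample = ("The blanks are all the same length, but remember that the words "
          "which have been taken out may be short or long.")
cloze, key = make_word_cloze(sample)
print(cloze)
print(key)
```

Scoring would then compare each examinee's entry with the corresponding word in `key`, or with any list of acceptable alternatives the test constructor chooses to allow.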



APPENDIX C (cont.) 

Le chef ne voulait pas être surpassé par les _____(1), aussi 
descendit-il de cheval et prit-il également _____(2) bâtons, mais Koumalo 
voyait bien qu'il ne comprenait _____(3) tout à fait de quoi il s'agissait. 
Jarvis _____(4) paraissait avoir la direction des opérations planta l'un 
_____(5) bâtons dans le sol et le chef en passa _____(6) à l'un de ses 
conseillers en lui disant _____(7) chose. Alors le conseiller planta lui 
aussi son bâton _____(8) le sol, mais l'homme à la boîte sur _____(9) 
pieds cria: --Pas là, pas là, enlevez-le. Sur _____(10) le chef, embarrassé 
et sachant de moins en moins _____(11) faire, remonta sur son cheval et y 
resta, laissant _____(12) hommes blancs planter leurs bâtons. Une heure 
s'écoula _____(13) bout de laquelle tout un déploiement de bâtons et 
_____(14) drapeaux se dressait, et Koumalo regardait toujours, de plus 
_____(15) plus stupéfait. Jarvis et le magistrat parlaient ensemble et 
_____(16) à se désigner tour à tour les collines et _____(17) vallée. Puis 
ils s'adressèrent au chef, tandis que _____(18) conseillers écoutaient 
gravement et attentivement leur conversation. Koumalo _____(19) Jarvis dire 
au magistrat: --C'est trop long. Le _____(20) haussa les épaules en 
disant: . . . . 
APPENDIX C (cont.) 

F 10 

J'ai grimpé tant bien que mal sur un _____(1) siège assez mal commode 
et presque aussitôt la longue _____(2) à laquelle nous faisions face a paru 
bondir derrière _____(3) tandis que la haute voix du moteur s'élevait 
_____(4) cesse jusqu'à ne plus donner qu'une seule _____(5), d'une extra- 
ordinaire pureté. Elle était comme le chant _____(6) la lumière, elle était 
la lumière même, et je _____(7) la suivre des yeux dans sa courbe immense, 
sa _____(8) ascension. Le paysage ne venait pas à nous, il _____(9) 
ouvrait de toutes parts, et un peu au-delà _____(10) glissement hagard de la 
route, tournait majestueusement sur lui-_____(11), ainsi que la porte d'un 
autre monde. J'_____(12) bien incapable de mesurer le chemin parcouru, ni le 
_____(13). Je sais seulement que nous allions vite, très vite, _____(14) 
plus en plus vite. Le vent de la course _____(15) était plus, comme au 
début, l'obstacle auquel je _____(16) appuyais de tout mon poids, il était 
devenu un _____(17) vertigineux, un vide entre deux colonnes d'air brassées 
_____(18) une vitesse foudroyante. Je les sentais rouler à ma _____(19) et 
à ma gauche, pareilles à deux murailles liquides, _____(20) lorsque j'essayai 
d'écarter . . . . 
APPENDIX C (cont.) 

INSTRUCTIONS FOR PART II 

On the pages that follow are words and parts of words. The first page has 
examples 11 letters long; the second one has examples 7 letters long. An example 
does not necessarily begin at the beginning of a word or end at the end of a word, 
and the examples are completely unrelated to each other. The middle letter of each 
example has been replaced by a blank. 

Try to put a letter in the blank that makes sense. Besides the ordinary 
letters of the French alphabet, there are two other symbols you may use: the 
apostrophe, and *, which stands for the space between words. 

Sample problems          Answers 

DEUX*_EURES              DEUX*HEURES 
ENTIO_*DE*V              ENTION*DE*V 
SON*F_ERE*E              SON*FRERE*E 
PLU_*TA                  PLUS*TA 
*QU_ELL                  *QU'ELL 
CIE_CE*                  CIE*CE* 

Do not look at the next page until your instructor tells you to begin. When 
you finish one page, wait until you are told you may go on to the next. 
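
The construction of a letter-cloze item as described in these instructions can be sketched as follows. This is only an illustration under stated assumptions: the text is upper-cased, each space between words is written as `*`, punctuation is left untouched, and the function names `encode` and `make_letter_cloze` are introduced here. The phrase "plus tard" is a hypothetical source for the 7-letter sample, chosen only because it begins with the same symbols.

```python
def encode(text):
    """Upper-case the text and mark each space between words with '*'."""
    return "*".join(text.upper().split())


def make_letter_cloze(text, start, length=11):
    """Cut `length` symbols beginning at `start` and blank the middle one."""
    if length % 2 == 0:
        raise ValueError("length must be odd so that there is a middle symbol")
    coded = encode(text)
    window = coded[start:start + length]
    if len(window) < length:
        raise ValueError("not enough text for an item of this length")
    middle = length // 2
    item = window[:middle] + "_" + window[middle + 1:]
    return item, window[middle]


print(make_letter_cloze("deux heures", 0))           # 11-symbol item and its answer
print(make_letter_cloze("plus tard", 0, length=7))   # 7-symbol item and its answer
```

Because the window may start anywhere in the running text, items produced this way need not begin or end at word boundaries, which is the property the instructions point out to the examinee.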



FRANÇAIS 

APPENDIX C (cont.) 

F 11 b 

SYMBOLS 

A À Â B C Ç D E É È Ê F G H I J K L M N O Ô P Q R S T U Ù Û V W X Y Z ' * 
                                                          (* = Espace) 




 1. OURS*_'EXCE          26. UN*DE_I*SOM 
 2. QU'EL_E*IMP          27. TOUT*_U*MOI 
 3. CTERI_TIQUE          28. ENT*P_R*QUA 
 4. ME*GU_RE*AV          29. TEUR*_*ARRE 
 5. AGE*D_HIVER          30. *UN*A_TRE*M 
 6. *CONT_NTES*          31. *SE*R_CHAUF 
 7. ERIVE_C'EST          32. NS*AP_ES*LE 
 8. C*LA*_LUS*G          33. *FLOT_ANT*D 
 9. ME*ET_QUE*S          34. NALEM_NT*DE 
10. C*IMP_TIENC          35. L'AUD_TOIRE 
11. ELLE*_E*REG          36. YANT*_N*N'A 
12. *JE*N_*SAIS          37. CIEUS_S*ET* 
13. CREUS_IT*PO          38. IS*A*_A*REC 
14. ORTAN_E*IL*          39. RS*JO_RS*OU 
15. S*ET*_ILLES          40. NE*SE_URITE 
16. L*N'E_T*PAS          41. LLE*C_UT*QU 
17. SIQUE_ENTRE          42. ND*JE_LUI*D 
18. T*IMP_RTANT          43. AIS*S_UVEE* 
19. OPTER_DEFIN          44. A*CHA_TER*S 
20. *MOIN_RE*PR          45. D*BAG_GE*A* 
21. QUI*E_T*HUM          46. *ET*D_UNE*B 
22. UN*JO_R*NOU          47. DONNA_LES*C 
23. ANT*D_*RAME          48. CARIB_U*COM 
24. COUP*_NE*TE          49. *ELLE_NE*SE 
25. ENS*S_ECROU          50. *LUI*_VAIT* 
FRANÇAIS 

SYMBOLS 

A À Â B C Ç D E É È Ê F G H I J K L M N O Ô P Q R S T U Ù Û V W X Y Z ' * 



1«. 


* Q U _ T U 


26. 


AS* 

4 


I E N 

• > 


2. 


\ p 

A ^ P S S E» 


27- 

— f «r 


V B E MAR 


3* 


* L A ^ 


, V I L 


28. 


ION 


P u 


It. 


CAPABLE 


29. 


0 R * _ E T T 


5* 


E * D ^ R B * 


30. 


* M A 

4 


s * c 

urn ^ 


6. 


0 U J _ U R S 


31. 


U T * 

\ 


A * R 

Ui. 


7. 


T 0 L E 


32. 


D E * 


_ E * P 


8. 


C 0_ME N 


33. 


N * M 


_T * M 


9* 


T R E _ P I E 


3lt. 


DIG 


_ E * A. 


10* 


BI E_*M A 


35. 


LA* 


_ 0 R T 


11. 


I N Q _ I E T 


36. 


* K U 


_ * A * 


12* 


I N T _ M I T 


37. 


NCR 


R * L 


13. 


A V E U N 


38. 


R * M 


_I S * 


lit. 


* N 0 ^ 


^ S * S 


39. 


* T ft 


_I S * 


15. 


T * I S * R 

'■im 


ItO. 


L < U 


_E * S 


16. 


E T # _ A N 0 


ill. 


N G £ 


_E T ♦ 


17. 


E * L ^ S * V 

f 


2t2< 


* P A 


^ A * R 


18. 


0 M E _ T * M 


lt3. 


AN* 


_ * E M 


19. 


A N D _ R * A 


lilt. 


SAT 


.ON* 


20. 


S * V _ Y 0 N 


It5. 


MAX 


^ * A * 


21. 


M £ « H E Z 

mm 


it6. 


V I L 


^ E * A 


22. 


C I * ^ I T * 


Ii7. 


0 M M 


_ D E * 


23. 


0 U T _ D E * 


ltd. 


* T 0 


_ J 0 U 


2lt. 


* D E _ N 0 U 


It9. 


* A * 


_ A B L 


25. 


N T R B A I 


50. 


A R D 


R* C 
APPENDIX D 

NAME: 

High School Class                      German Class 

How long have you studied German?          years 

Have you ever studied any other foreign language?   yes ___   no ___ 
                                                     (check one) 

If "yes", what other foreign language did you study? 

For how many years? 

Is a foreign language spoken in your home?   yes ___   no ___ 
                                              (check one) 

If "yes", what language? 



INSTRUCTIONS FOR PART I 

Part I of this test is made up of two German passages, each from a 
different book. Every tenth word has been replaced by a blank. Your task 
is to read through the passage to see what it is about, and then try to 
fill in each blank with a German word that makes sense. The blanks are all 
the same length, but remember that the words which have been taken out may 
be short or long. As some blanks will be easier to fill than others, it is 
a good idea to do the easier ones first, and then go back to work on the 
harder ones, if you have time. 

Do one page at a time. When you finish one page, wait until your 
instructor tells you to go to the next. Do not look at the first page until 
your instructor tells you to begin. 

Work rapidly, because you have only nine minutes for each passage! 






APPENDIX D (cont.) 

Nun wollte sich der Häuptling von den weissen Männern _____(1) 
ausstechen lassen; also stieg er auch vom Pferd und _____(2) paar Stöcke, 
aber Kumalo sah wohl, dass er _____(3) recht wusste, was eigentlich vorging. 
Jarvis, der die Sache _____(4) leiten schien, steckte einen Stock in den 
Boden, und _____(5) Häuptling gab einem seiner Räte einen Stock und sagte 
_____(6) zu ihm. Also steckte dieser Mann auch den Stock _____(7) den 
Boden, aber der weisse Mann mit dem Kasten _____(8) drei Beinen rief: 
Nicht dort, nicht dort; nimm den _____(9) weg. Der Mann sah zweifelnd und 
zögernd den Häuptling _____(10), und der sagte ärgerlich: Nicht dort, nicht 
dort; nimm _____(11) weg. Und dann stieg er wieder auf sein Pferd, 
_____(12) und noch weniger als vorher begreifend, und sass da _____(13) 
liess die weissen Männer ihre Stöcke in den Boden _____(14). So verging 
eine Stunde, während derer sich eine ganze _____(15) von Stöcken und Fahnen 
bildete, und Kumalo sah zu _____(16) wusste nach wie vor überhaupt nicht, 
was eigentlich vorging. Jarvis _____(17) der Bürgermeister standen 
beieinander, und sie deuteten immer _____(18) den Bergen hinauf und dann 
wandten sie sich _____(19) deuteten ins Tal hinunter. Dann sprachen sie mit 
dem _____(20), und die Räte standen dabei . . . . 
APPENDIX D (cont.) 

9 0 lU 

Widerwillig öffnete Jody die Augen. Manchmal überlegte er sich, 
_____(1) schön es sein müsste, in den Wald zu entwischen _____(2) dort 
ungestört von Freitag bis Montag zu schlafen. Durch _____(3) Ostfenster 
seines kleinen Zimmers drang das Tageslicht herein. Er _____(4) nicht 
recht, ob ihn die fahle Helle geweckt hatte _____(5) das Gegacker der 
Hühner, die sich unter den Pfirsichbäumen _____(6) schaffen machten. Er 
hörte, wie sie, eines nach dem _____(7), von ihren Schlafplätzen im Baum 
zur Erde flatterten. Im _____(8) färbte sich der Himmel mit gelbroten 
Streifen, vor denen _____(9) die Umrisse der Fichten standen. Die Sonne 
ging im _____(10) bereits merklich früher auf. Es konnte noch nicht sehr 
_____(11) sein, und es war angenehm, aufzuwachen, ehe ihn die _____(12) 
rief. Geniesserisch drehte er sich noch einmal auf die _____(13) Seite. 
Das trockene Maisstroh seines Bettes raschelte unter ihm. _____(14) 
leuchtenden Streifen im Osten wurden heller. Ein goldener Strahl _____(15) 
die Spitzen der Fichten, und während er noch hinsah, _____(16) sich die 
Sonne selbst als glühender Ball. Wie von _____(17) wachsenden Licht aus 
Osten herbeigeblasen rührte sich ein Lüftchen; _____(18) Rupfenvorhänge 
wehten ins Zimmer. Nun strich die Brise über _____(19) Bett und glitt 
weich, wie eine streichelnde Hand, über Jody _____(20). Es war noch 
so warm . . . . 















APPENDIX D (cont.) 

INSTRUCTIONS FOR PART II 

On the pages that follow are words and parts of words. The first page 
has examples 11 letters long; the second one has examples 7 letters long. 
An example does not necessarily begin at the beginning of a word or end at 
the end of a word, and the examples are completely unrelated to each other. 

Here the middle letter has been replaced by a blank: try to put a 
letter in the blank that makes sense. Besides the ordinary letters of the 
German alphabet, there is one other symbol you may use: *, which stands 
for the space between words. 



Sample Problems          Answers 

JEDE*_OCHE*              JEDE*WOCHE* 
ACHST_N*KM               ACHSTEN*KM 
INMAL_SPOTT              INMAL*SPOTT 
*ES_KLO                  *ES*KLO 
IM*_IT*                  IM*MIT* 
HEU_E*U                  HEUTE*U 

Do not look at the next page until your instructor tells you to begin. 
When you finish one page, wait until you are told you may go on to the next. 
APPENDIX D (cont.) 
G 11b 

MÖGLICHE*ZEICHEN 

* A Ä B C D E F G H I J K L M N O Ö P Q R S T U Ü V W X Y Z 

 1. UCH*F_HRT*E          26. *BIS*_ERLE 
 2. N*MUS_EN*SI          27. EL*VE_BLASS 
 3. IND*D_S*SCH          28. *IN*D_R*NAC 
 4. R*JUN_E*KAP          29. DER*D_NKLE* 
 5. N*HAN_ELSMA          30. *TAG*_ETRAT 
 6. *FLEH_NTLIC          31. *DES*_ANDES 
 7. ERKLA_TE*SI          32. RAUME_BEHER 
 8. EN*SE_HS*ST          33. HRE*I_RFAHR 
 9. RSTE*_ORGEN          34. MISSI_N*IN* 
10. *GRAU_E*GLI          35. *AUFH_ELT*E 
11. MEER_HNAU            36. GEGEN_*DA*S 
12. VERSC_WINDE          37. NGEN*_EISE* 
13. DIE*M_NSCHE          38. N*IHR_EINEN 
14. GEHOL_EN*HA          39. HRER*_IT*SI 
15. AUF*I_MER*B          40. DETE*_ND*VO 
16. OCHE*_TAND*          41. IGEN*_LTEN* 
17. WAR*I_R*DAS          42. INEM*_ONAT* 
18. UNDLI_HE*MI          43. ERNKL_IDER* 
19. GENOM_EN*UN          44. *LETZ_E*STR 
20. ICKT*_ASS*I          45. U*BEW_LTIGE 
21. USCHT_WURDE          46. CHMIT_AG*UM 
22. R*DIE_SCHMU          47. ITTEN_WAREN 
23. DUNKE_H*SIR          48. MAULT_ERE*Z 
24. ER*DE_*DIE*          49. MIT*S_INEM* 
25. CHT*V_RSANK          50. R*INS_LAND* 
APPENDIX D (cont.) 
G 7b 

MÖGLICHE*ZEICHEN 

* A Ä B C D E F G H I J K L M N O Ö P Q R S T U Ü V W X Y Z 



1* 


D B S _ V 0 L 


26. 


V 0 M _ S T A 


2. 


L E N U N 


27. 


S C H _ S * F 


3» 


L E G _ N * S 


28. 


B E I _P E I 


h. 


U N T _ R T A 


29. 


L S * _R * A 


s. 


D A U R W 


30. 


SICKERT 


6. 


GESTALT 


31. 


I S C _E * V 


7. 


U N T _ R * D 


32. 


* A L _ E M * 


8. 


S E I _ * U N 


33. 


G E K _S E I 


9* 


0 N D _ S T E 


3U. 


B A L _ T E # 


10, 


F E N _B R Z 


39. 


C H E G L 


!!• 


s S * _ U * S 


36. 


B R U_ A L E 


f\> 

• 


E S E ^ E X 


37. 


0 R T _ 0 D 0 


13. 


N E * _ E S T 


38. 


I N * _E S E 


il*. 


SOLUTE* 


39. 


I C H_*D E 


19. 


C H E L A 


hOm 


N D * _ C H 0 


16. 


R * D _ R * F 


iil. 


BE N _GE B 


17. 


I N * _E N S 


Ii2. 


W A H _ E » D 


18. 


E L L _ S I G 


k3. 


U M * _ H N * 


19. 


A N N_N A C 


lik* 


0 P F _R * G 


20. 


* D A ^ * G L 


16. 


WELCHES 


21. 


E R P __ E S S 


h6. 


S T E _ L U N 


S 

04 

CM 


T E * _ I K T 


1*7. 


W W E 0 N 


23. 


R * I _ N * K 


1*6. 


* E I _ E * K 


2U. 


E R * _ E I N 


1*9. 


G * J ^ H R E 


29. 


E R * _ I E D 


90. 


Z A H ^ R * 0 






Appendix page 124. Sample Scrambled Test 

1. made       6. to          11. importance    16. the 
2. it         7. language    12. of            17. parts 
3. human      8. long        13. Should        18. 
Appendix C, pages 126-127. French Word-Cloze Passages 
Appendix C, pages 129-130. French Letter-Cloze Items 
Appendix D, pages 132-133. German Word-Cloze Passages 
Appendix D, pages 133-136. German Letter-Cloze Items 